VIRULENCE GENES OF M. MARINUM AND M. TUBERCULOSIS

Background of the Invention 

Mycobacteria are bacterial organisms which are implicated in diseases such as, e.g., 
tuberculosis. It would be desirable to provide means for treating or preventing conditions 
caused by such mycobacteria, e.g., by immunization.

Description of the Invention 

This invention relates, e.g., to virulence genes of mycobacteria. The invention 
provides methods to identify and isolate virulence genes of, for example, Mycobacterium 
marinum, a fish bacterium, and Mycobacterium tuberculosis, the primary etiologic agent
of human tuberculosis. The invention also provides methods to mutagenize such virulence
genes, thereby allowing the generation and isolation of avirulent mycobacteria. The
invention also relates to isolated virulence genes and variants and fragments thereof; to
isolated virulence gene products and variants and fragments thereof; to mutant, avirulent,
bacteria; to attenuated vaccines comprising the mutant bacteria; and to methods to elicit
an immune response in a host, using such mutant bacteria.



One embodiment of the invention is a method for identifying a virulence gene of 
M. marinum, comprising 

a) mutagenizing M. marinum bacteria by introducing into said bacteria a plasmid
which comprises a tagged (e.g., signature-tagged) transposon, whereby the transposon
integrates into and disrupts a gene in the bacteria,

b) introducing said mutagenized bacteria into a host susceptible to infection thereof
(e.g., a goldfish),

c) identifying a mutagenized bacterium which comprises a tagged transposon and
which exhibits reduced viability in the host, compared to other mutagenized (or
non-mutagenized) M. marinum bacteria,

d) cloning and/or sequencing (characterizing) a nucleic acid sequence which flanks 
the integrated transposon in said identified mutagenized bacterium, and 

e) identifying a wild type M. marinum gene which comprises at least a portion of 
said flanking sequence. 

Of course, the above method can be carried out using one or more of the steps, in
any order, effective to achieve the intended purpose.



WO 01/19993 




PCT/US00/25512 



Another embodiment is a method for identifying a virulence gene of M. 
tuberculosis, comprising identifying an M. marinum virulence gene as described above,
and further comprising,

comparing said flanking nucleic acid sequence to a databank of M. tuberculosis
nucleic acid sequences, and/or comparing the sequences of peptides which are coded for 
by said flanking sequences to a known M. tuberculosis protein database, and 

identifying an M. tuberculosis gene which comprises a sequence that is 
substantially identical to said flanking sequences and/or polypeptides encoded by them. 
In other embodiments, the degree of identity can be less than substantially identical, e.g., 
about 35-50%, or about 50-70%, or about 70-90%. 

Another embodiment is a method for isolating a mutagenized M. marinum 
bacterium which exhibits reduced virulence in a host susceptible to infection thereof 
compared to a non-mutagenized M. marinum bacterium, comprising integrating a tagged
(e.g., signature-tagged) transposon into the DNA of an M. marinum bacterium in a manner
effective to produce reduced virulence, and isolating said mutagenized bacterium.

Another embodiment is an avirulent M. marinum bacterium in which one or more
genes comprising a nucleic acid of SEQ ID NOs: 4, 6, 8, 10, 11, 13, 17, 21, 23, 25, 27, 29,
31, 35, 39, 41, 43 or 44 are mutated. Another embodiment is a pharmaceutical
composition or an attenuated vaccine comprising such an avirulent M. marinum bacterium
and a pharmaceutically acceptable carrier.

Another embodiment is an avirulent M. tuberculosis bacterium in which one or
more virulence genes identified as described above are mutated. Another embodiment is
an avirulent M. tuberculosis bacterium in which one or more of the genes encoding
proteins Rv0822c, CY20G9.23 (Rv0497), the pks family, including e.g., ppsE (Rv2935),
pks6 (Rv0405), pks9 (Rv1664), pks8 (Rv1662), pks1 (Rv2946c), and pks002c, Rv3511,
008381 (Rv0357c), Rv3775, Rv3137, Rv2348c, Rv3860, mbtB (Rv2383c), Rv2181,
Rv1954c, Rv0987, Rv3268, Rv2610c, nrp (pir E70751, Rv0101), mbtE (Rv2380c),
Rv0236c or smc (Rv2922c) are mutated. Another embodiment is a pharmaceutical
composition or an attenuated vaccine comprising one or more of the above avirulent M.
tuberculosis bacteria (e.g., an M. tuberculosis strain constructed with one or more
mutations in one or more of the above virulence genes) and a pharmaceutically acceptable
carrier.

Another embodiment is an isolated nucleic acid of M. marinum comprising an 
oligonucleotide of SEQ ID NOs: 4, 6, 8, 10, 11, 13, 17, 21, 23, 25, 27, 29, 31, 35, 39, 41, 
43 or 44, or a variant or fragment thereof. Another embodiment is a nucleic acid which 
is complementary to at least a portion of said isolated M. marinum nucleic acid, or which 
can hybridize to at least a portion of said isolated M. marinum nucleic acid under selected
(e.g., high) stringency conditions. In other embodiments, the isolated M. marinum nucleic
acid is a gene; or the isolated M. marinum nucleic acid or fragments thereof are cloned 
into, and/or expressed in, an expression vector. 

Another embodiment is an isolated nucleic acid of M. tuberculosis, comprising a 
virulence gene identified as above, or a variant or fragment thereof. Another embodiment
is a nucleic acid which is complementary to at least a portion of said isolated M. 
tuberculosis nucleic acid, or which can hybridize to at least a portion of said isolated M. 
tuberculosis nucleic acid under selected (e.g., high) stringency conditions. In other 
embodiments, the isolated M. tuberculosis nucleic acid or fragments thereof are cloned 
into, and/or expressed in, an expression vector. 

Another embodiment is a method to elicit an immune response in a fish,
comprising introducing into the fish an avirulent M. marinum bacterium made (e.g.,
isolated, constructed) as described above. Another embodiment is a method to elicit an
immune response in a human or non-human animal (e.g., domestic or farm animal, such
as a cow) host, comprising introducing into said host an avirulent M. tuberculosis
bacterium, in which one or more virulence genes identified as described above are
mutated. Another embodiment is a method to elicit an immune response in a human host,
comprising introducing into such host an avirulent M. tuberculosis bacterium in which one
or more of the genes encoding proteins Rv0822c, CY20G9.23, the pks family of proteins,
Rv3511, 008381, Rv3775, Rv3137, Rv2348c, Rv3860, mbtB, Rv2181, Rv1954c, Rv0987,
Rv3268, Rv2610c, nrp (pir E70751), mbtE, Rv0236c or smc is mutated.



A wide variety of Mycobacteria species can be used in the invention. In a most
preferred embodiment, the bacterium is Mycobacterium marinum (M. marinum), which
causes fish tuberculosis, as well as, in humans, skin infection or localized nodular and
ulcerated lesions (mariner's tuberculosis) on the extremities and, in immunocompromised
patients, systemic disease; Mycobacterium tuberculosis (M. tuberculosis), the primary
etiologic agent for tuberculosis (TB) in man; or Mycobacterium bovis (M. bovis), which
causes human or bovine tuberculosis. Other species of Mycobacterium which can be used
in the invention include, e.g., M. bovis BCG, M. africanum, M. leprae, M. microti, M.
smegmatis, M. vaccae, M. ulcerans, M. haemophilum, M. fortuitum, M. chelonae, and
others.

The term "virulent" in the context of mycobacteria refers to a bacterium or strain 
of bacteria that replicates within a host cell or animal within the mycobacterium host range 
at a rate which is detrimental to the cell or animal, or that induces a host response which 
is detrimental. More particularly, virulent mycobacteria persist longer in a host than 
avirulent bacteria. Virulent mycobacteria are typically disease producing; and infection
leads to various disease states including fulminant disease in the lung, disseminated
systemic miliary tuberculosis, tuberculosis meningitis, and/or tuberculosis abscesses of
various tissues. Infection by virulent mycobacteria often results in death of the host 
organism. 

By contrast, the term "avirulent," as used herein, refers to a bacterium or strain of
bacteria that does not replicate within a host cell or animal within its host range; replicates
at a rate which is not significantly detrimental to the cell or animal; and/or does not induce
a detrimental host response. An avirulent (e.g., attenuated, non-pathogenic) strain is
incapable of inducing a full suite of symptoms of the disease that is normally associated
with its virulent pathogenic counterpart. Avirulent bacteria exhibit a reduced ability, or
an inability, to survive in a host, but not all bacteria which exhibit such an impaired ability
to survive in a host are avirulent. For example, in a simultaneous in vivo test of several
mutant bacteria, certain mutants which are unable to compete with other mutants may not,
when tested in the presence of the other strains, replicate efficiently or survive in the host;
however, such bacteria, when tested individually, may prove to be virulent. An avirulent
bacterium can contain one or more mutations in one or more virulence genes.

A "virulence gene" encodes a gene product ("virulence factor," "virulence
determinant") which contributes, directly or indirectly, to infection (e.g., attachment,
invasion, transport into the cell, replication, etc.) and/or to tissue destruction and/or
disease. A virulence gene can code for or modify, e.g., an adhesion molecule or other
molecule which aids in the attachment to or invasion of a host cell; a toxin (e.g., a secreted
factor which can cause lysis or damage of a host cell, for example, a small molecule such
as a polyketide, or an enzyme such as a phospholipase, lipase, esterase or protease); a
factor required for efficient secretion of such a toxin; a factor involved in intracellular
multiplication or growth; a factor involved in resistance to host defenses; a factor which
can stimulate a host cell to produce an inflammatory product or cytokine that can amplify
tissue damage in a host; or a factor which regulates the production and/or activity of a
virulence factor. Also included are certain functions which resemble "housekeeping"
functions, e.g., functions which allow bacteria to provide nutrients that are limiting in a
host, such as factors which aid in the acquisition of iron, or certain enzymes of purine or
pyrimidine biosynthesis. For a review of some of the putative or suspected virulence
determinants of Mycobacterium tuberculosis, see Quinn et al (1996). Curr. Top.
Microbiol. Immunol. 215, 131-156.

By a "host" for a bacterium is meant an organism, or a cell or tissue of an
organism, which can be infected by the bacterium and which exhibits consequences of that
infection. For example, Mycobacterium marinum can infect and cause symptoms in the
frog (Rana pipiens) or in any of about 150 fresh-water or salt-water species of fish. In an
especially preferred embodiment, the host for Mycobacterium marinum is the goldfish,
Carassius auratus. Well-established animal models for M. tuberculosis include, e.g.,
guinea pig, mouse, rabbit and monkey; and many natural hosts exist for that bacterium,
including large animals such as the elephant. Many other bacteria/host combinations are
possible. See, e.g., B. Bloom, ed., (1994). Tuberculosis: Pathogenesis, Protection, and
Control, ASM Press, Washington, D.C., Chapter 11, for a discussion of tuberculosis in wild
and domestic animals.

A system in which goldfish are infected by M. marinum (the "goldfish model")
offers a number of advantages for experimental studies. For example, M. marinum has a
generation time of only 4 hours (as compared, e.g., to the greater than 20 hour generation
time of M. tuberculosis), and studies with M. marinum can be carried out in a Biosafety
Level 2 facility (whereas a Biosafety Level 3 facility is required, e.g., for studies with M.
tuberculosis). M. marinum can serve as an appropriate surrogate model for the study of
M. tuberculosis. M. marinum and the M. tuberculosis complex have been shown to be
closely related by, e.g., DNA hybridization and 16S rRNA gene sequence analysis (see,
e.g., Tonjum et al (1998). J. of Clinical Microbiology 36, 918-925). The disease
progression and symptoms of fish infected with M. marinum mimic those of humans
infected with M. tuberculosis: in both types of hosts, organs in all parts of the body can be
infected; both bacteria replicate within macrophages and reside in an endosomal
compartment which is nonacidic and does not fuse with the lysosomal compartment; and
both bacteria readily kill macrophages.

Examples 1B and 1C show, e.g., that the pathology in the goldfish model parallels
that of human tuberculosis. Depending on the dose of M. marinum organisms which is 
inoculated into a fish, acute or chronic disease is elicited. The pathology of the acute 
disease includes severe peritonitis and necrosis with all animals dying within 17 days of 
infection. The pathology of the chronic disease includes progressive granuloma formation. 
Granulomas with different histopathological features (necrotizing, non-necrotizing and 
caseous) are seen in the experimentally infected goldfish, which is consistent with the 
granuloma types seen in naturally infected animals and parallels the types of granulomas 
found in human tuberculosis. Isolation of M. marinum from fish tissue is possible
throughout the course of the experiment presented in Example 1 (up to 16 weeks) 
indicating, as in human tuberculosis, the persistence of the organisms in the host. Example 
2 shows that the goldfish model can be used to distinguish virulent and avirulent forms of 
M. marinum. Further disclosure of how to make the goldfish model, and how to use it, 
e.g., to characterize molecular pathogenesis, can be found, e.g., in Talaat A.M. et al 
(1998). Infection and Immunity 66, 2938-2942. 

As an initial step in isolating virulence mutants, bacteria, e.g., M. marinum, can be
mutated by any of a variety of routine procedures which are well-known in the art, e.g.,
exposure to chemical agents, irradiation, genetic engineering, transposon mutagenesis, or
the like. As used in this application, the term "mutation" means any change (in
comparison with the appropriate parental strain) in the DNA sequence of an organism, e.g.,
a single (or multiple) base change, insertion, deletion, inversion, translocation, duplication,
or the like. A mutation can be polar or non-polar, a frameshift or in phase. Preferably, in
particular when a mutated bacterium is used as part of a treatment regimen or a vaccine,
the mutation is substantially incapable of reverting to the wild type.

In a most preferred embodiment, mutagenesis is carried out by a transposon 
mutagenesis system that carries sequence-specific tags, sometimes known as signature- 
tagged mutagenesis (STM). The unique tag sequence allows differentiation of individual 
mutants among an inoculum pool of mutants. The STM protocol permits the screening of 
a large number of mutants using a small number of animals. This method was developed 
by Hensel et al (Hensel et al (1995). Science 269, 400-403; U.S. Pat. No. 5,876,931 to
Holden). Variations of the method and procedures for using it to isolate bacterial virulence
mutants are also disclosed in, e.g., Shea et al (1996). Proc. Natl. Acad. Sci. 93, 2593-
2597; Mei et al (1997). Mol. Microbiol. 26, 399-407; Schwan et al (1998). Infec. Immun.
66, 567-572; and Chiang et al (1998). Mol. Microbiol. 27, 797-805. Example 3 shows the
use of the STM system for the mutagenesis of M. marinum. 

Any of a variety of methods can be used to generate a bank of plasmids carrying 
unique signature-tagged transposons. A most preferred embodiment is shown in Example 
3A. Here, 96 independent, non-cross-hybridizing, signature-tagged transposons, each of 
which is hybridization- and amplification-efficient, are cloned into a mycobacteria suicide 
vector which carries a selectable marker. Many variants of such vectors, carrying any of 
a variety of selectable markers, can be used, of course. In Example 3A, the marker is a
kanamycin-resistance gene.

To generate a mutant mycobacterium library, plasmids from a master plasmid
collection are introduced individually (e.g., separately) into mycobacteria, preferably M.
marinum, by any of a variety of routine, art-recognized techniques (e.g., phage
transduction, shooting a "gene gun," electroporation, or other conventional techniques).
In a most preferred embodiment, as shown in Example 3C, plasmids are introduced into
M. marinum by electroporation. Any desired number of transformed bacteria can be
selected from each transformation. In Example 3C, ninety-six transformations are
performed, one with each of the 96 master plasmids; and ten independent transformants
are selected from each transformation, to yield a library of 960 transformants. As Example
3B shows, the transposons integrate randomly into the M. marinum chromosome. In the
ideal circumstance, each integrated transposon disrupts a different gene, or a different
portion thereof, to create a library of, in this example, 960 differently mutagenized
bacteria.
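By way of illustration only, the bookkeeping for such a library can be sketched as follows. The identifiers, the function names, and in particular the grouping of the 960 mutants into ten pools of 96 (one mutant per tag in each pool) are assumptions made for this sketch; the text above specifies only the 96 transformations and the ten transformants selected from each.

```python
from itertools import product

N_TAGS = 96          # signature-tagged master plasmids (Example 3C)
PICKS_PER_TAG = 10   # independent transformants kept per transformation


def build_library(n_tags=N_TAGS, picks=PICKS_PER_TAG):
    """Return one (tag, pick) identifier per mutant in the library."""
    return list(product(range(1, n_tags + 1), range(1, picks + 1)))


def make_pools(library, picks=PICKS_PER_TAG):
    """Group mutants so that each tag appears exactly once per pool.

    Assumption for this sketch: pool k holds the k-th transformant
    picked from every one of the tagged transformations.
    """
    pools = {k: [] for k in range(1, picks + 1)}
    for tag, pick in library:
        pools[pick].append((tag, pick))
    return pools


library = build_library()
pools = make_pools(library)
```

Because every tag occurs only once in a pool, a hybridization signal for a given tag can be traced back to a single mutant in that pool.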

Pools of mutagenized bacteria (each an "input pool"), in which each mutant can be
detected independently by virtue of its unique signature tag, are introduced into an
appropriate host, e.g., a goldfish. Bacteria may be introduced into an animal by any route, e.g., orally,
intraperitoneally, intravenously or intranasally; for fish, the preferred routes of
administration are oral or, most preferably, intraperitoneal. It may be useful to compare,
e.g., virulence genes identified by oral administration to those identified by intraperitoneal
administration, as some genes may be required to establish infection by one route but not
by the other. Bacteria are left in the host for a suitable length of time, which is a function
of both the microorganism and the host. A method for optimization of some of the
infection parameters for the M. marinum/goldfish system is shown, e.g., in Examples 1 and
2.

Assays are performed to determine whether the bacteria are able to survive in the 
host during the period of infection. Any of a variety of such assays can be used, e.g., 
subtractive hybridization, differential display, or the like. In a most preferred embodiment, 
as shown in Example 4A, after an optimized period of infection by a pool of M. marinum 
mutants, fish are sacrificed and one or more internal organs, e.g., spleen, liver, kidney, 
peritoneum, heart, pancreas, or other organs evident to one of skill in the art, are cultured 
to isolate the mutant bacteria which were able to survive in the fish, defined as the output 
pool. A hybridization protocol to identify mutants present in the input and output pools 
is described in Example 4A. Mutants which are present in the input pool, but which
cannot be detected in the output pool after a predetermined time of infection has elapsed,
are candidates for avirulent mutants, i.e., mutants which are unable to infect, replicate
and/or cause damage, in a particular cell type or tissue.
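The input/output comparison can be sketched, purely as an illustration, in the following way. The per-tag signal values, the tag names and the 10% cut-off are hypothetical; the actual hybridization protocol and any scoring criteria are those of Example 4A, which is not reproduced here.

```python
def candidate_attenuated(input_signal, output_signal, threshold=0.1):
    """Return tags detected in the input pool but lost from the output pool.

    input_signal / output_signal map a tag name to a hybridization signal
    (arbitrary units). `threshold` is a hypothetical cut-off: a tag is
    flagged when its output signal falls below this fraction of its
    input signal.
    """
    candidates = []
    for tag, in_val in input_signal.items():
        if in_val <= 0:
            # Tag was never detected in the input pool, so it is uninformative.
            continue
        out_val = output_signal.get(tag, 0.0)
        if out_val / in_val < threshold:
            candidates.append(tag)
    return sorted(candidates)


# Hypothetical signals for three tags.
input_pool = {"tag01": 1.0, "tag02": 0.9, "tag03": 0.0}
output_pool = {"tag01": 0.95, "tag02": 0.02}
flagged = candidate_attenuated(input_pool, output_pool)
```

A mutant flagged in this way is only a candidate; as noted above, a mutant that is merely outcompeted in a mixed infection may still prove virulent when tested alone.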

In order to confirm that an M. marinum mutant is avirulent, each putative virulence 
mutant can be re-examined individually, e.g., in the goldfish model. In a preferred 
embodiment, the median survival time (MST) of goldfish infected with a lethal dose (about
5 x 10^8 cfu) of a putative virulence mutant can be determined, and those mutants which
allow goldfish to survive longer than fish inoculated with an equivalent dose of wild type
organisms are categorized as putative virulence mutants. Many other types of screening
assays can be used, including Competitive Indices, histopathology examinations of one or 
more of the organs described above, colony counts in organ homogenates, and analysis of 
the ability of a mutant to induce granuloma formation. Representative protocols for each 
of these methods are described, e.g., in Example 4B. In addition to confirming the 
existence of a virulence mutant, data collected on each mutant can yield clues to the 
pathogenesis pathways of M. marinum in the goldfish model. Methods to show that 
Koch's postulates have been fulfilled (proving that a postulated virulence gene is 
responsible for disease symptoms) are routine; one such method is presented in Example 
8. 
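As a minimal sketch of the median survival time comparison, assuming that every fish in each group dies within the observation period (so that no censoring has to be handled), one could proceed as follows. The survival times shown are invented for illustration and are not data from the Examples.

```python
from statistics import median


def median_survival_time(days_to_death):
    """Median survival time (MST) for a group in which all animals died."""
    return median(days_to_death)


def prolongs_survival(mutant_days, wild_type_days):
    """True when the mutant group's MST exceeds the wild type group's MST."""
    return median_survival_time(mutant_days) > median_survival_time(wild_type_days)


# Hypothetical days to death for five fish per group.
wild_type = [8, 9, 10, 11, 12]
mutant = [20, 25, 30, 35, 40]
```

If some animals survive to the end of the experiment, a survival analysis that accounts for censored observations would be needed in place of a simple median, and a statistical test would be needed to judge whether a difference is significant.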

Alternative approaches to the STM technique can be used to identify avirulent M.
marinum mutants. For example, one can screen a library of M. marinum cosmids in M.
smegmatis. In the goldfish model, M. smegmatis does not persist in tissue when
inoculated at a dose of 10^7 organisms/fish. This is in contrast to M. marinum, which can
be isolated from fish tissue throughout the course of a 56 day experiment. In this
alternative approach, one can inject the fish with pools of the M. marinum cosmids in M.
smegmatis and look for those which survive in the animal. A library of M. marinum
cosmids in M. smegmatis can be obtained routinely, using standard, art-recognized
procedures.

Once an insertionally mutated M. marinum bacterium has been identified as being 
a (putative) virulence mutant, a wild type M. marinum can be engineered to contain a more 
well-defined (e.g., non-polar) mutation. The introduction of such a well-defined mutation 
into a new genetic background can confirm that the original phenotype was the result of 
the transposition event, rather than a secondary mutation. Furthermore, a well-defined 
mutation can be used to ascertain the presence, if any, of polarity effects. For example, 
the insertion of a transposon into a gene which is part of an operon can have polar effects 
on downstream genes in the operon. One method to determine if a given defect results 
from inactivation of the gene into which a transposon integrated, or if the actual virulence 
gene(s) lies downstream of the integration site, is to generate a small, in-frame, non-polar, 
deletion or insertion into a wild type correlate of the gene into which the transposon had
integrated. If such a mutant, when tested, for example as described above in the fish
model, does not exhibit an avirulent phenotype, other genes in the operon can be mutated 
and analyzed in the same manner until one (or more) virulence genes are identified. That 
is, nucleic acid sequences which flank the integrated transposon can be cloned and 
sequenced in several sequential steps (e.g., one can "walk" down an operon) until a 
virulence gene is identified. Of course, the invention includes genes which lie downstream 
of a gene in which a polar mutation results in an avirulent phenotype. Such genes can be 
considered to be "genes of the invention" or "genes identified by methods of the 
invention." 

As a first step in performing site-specific mutagenesis of a gene of interest, it is
preferable to isolate (e.g., clone) at least a portion of the corresponding wild type gene.
If the gene is part of an operon, some, if not all, of the other genes in the operon can also
be isolated. As used in this application, the term "isolated" (referring, e.g., to a gene or
gene product, nucleic acid, protein, bacterium, etc.) means being in a
non-naturally-occurring form. Methods to clone genes, particularly those containing a unique marker,
are routine for one of ordinary skill in the art. (See, e.g., Sambrook, J. et al (1989).
Molecular Cloning, a Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY; Ausubel, F.M. et al (1995). Current Protocols in Molecular Biology,
N.Y., John Wiley & Sons; Davis et al (1986), Basic Methods in Molecular Biology,
Elsevier Sciences Publishing, Inc., New York; Hames et al (1985), Nucleic Acid
Hybridization, IRL Press; Dracopoli, N.C. et al, Current Protocols in Human Genetics,
John Wiley & Sons, Inc.; and Coligan, J.E. et al, Current Protocols in Protein Science,
John Wiley & Sons, Inc., for many of the molecular biology techniques referred to in this
application, including isolating, cloning, modifying, labeling, manipulating, sequencing,
and otherwise treating or analyzing nucleic acid and/or protein.) In one method, clones
comprising a gene(s) of interest can readily be identified and isolated from a wild type
library (e.g., a cosmid library, Bacterial Artificial Chromosome (BAC) library (Brosch,
R. et al (1998). Infect. Immun. 66, 2221-2229; Philipp, W.J. et al (1996). PNAS 93, 3132-
37), phage library, cDNA library, or the like), using conventional, routine, procedures in
the art. Methods for subcloning a gene(s) of interest are also routine for one of ordinary
skill in the art.

Example 6 describes a preferred embodiment of the invention, in which a
hybridization probe corresponding to gene sequences flanking the site of transposon
integration in an M. marinum mutant is used to screen a cosmid library of wild type M.
marinum genes. Because many M. marinum genes are about 2 kb in size, and the average
DNA insert in a cosmid library can be about 30-40 kb, it is likely that a cosmid clone so
identified will contain the entire operon, if any, in which the gene of interest is located.
It is understood, of course, that the genes and clones referred to in this application
typically are double-stranded; therefore, a probe "corresponding to" a given sequence can
be designed to hybridize to either of the strands of the DNA duplex, or to a nucleic acid
(e.g., RNA or cDNA) which is complementary to one strand of the duplex.

The term "a cloned gene," as used herein, can encompass not only the regions of 
DNA that code for a polypeptide but also regulatory regions of DNA such as regions of 
DNA that regulate transcription, translation and, for some microorganisms, splicing of 
RNA. Thus, a "gene" can include promoters, transcription terminators, ribosome-binding
sequences and, for some organisms, introns and splice recognition sites. A cloned "gene"
as used herein can be, e.g., a genomic or a cDNA gene, or a rRNA or tRNA gene, or the 
like. 

After a gene of interest, or a portion thereof, has been cloned, defined mutation(s)
can be introduced into it, using methods of site-specific mutagenesis which are
well-known in the art. Any type of mutation, for example those defined above, can be
introduced into a cloned gene of interest. In a preferred embodiment, a wild type, cloned
M. marinum virulence gene is mutated such that an insertion or deletion (ranging from
about 3 bases to about 90% of the entire gene sequence, preferably about 99 to about 4000
bases, most preferably about 500 bases) is introduced in such a way that the coding
sequences remain in phase (i.e., the insertion or deletion is a multiple of 3 bases). In a
most preferred embodiment, the mutation is an insertion of a nucleic acid fragment which
comprises a kanamycin resistance marker. The site of the mutation can be chosen at will,
but it is preferably in the 5'-terminal half of the gene. The availability of convenient
restriction sites in the gene can simplify the introduction of mutations.
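The in-phase requirement is simple arithmetic and can be checked, for example, as sketched below. The function names are hypothetical and serve only to illustrate the multiple-of-3 rule stated above.

```python
def is_in_frame(indel_length_bp):
    """True when an insertion or deletion of this length keeps the reading frame."""
    if indel_length_bp <= 0:
        raise ValueError("indel length must be a positive number of bases")
    return indel_length_bp % 3 == 0


def nearest_in_frame(target_bp):
    """Largest multiple of 3 not exceeding target_bp (never less than 3)."""
    return max(3, (target_bp // 3) * 3)
```

For instance, a length of 99 bases satisfies the rule, whereas "about 500 bases" would in practice be rounded to an in-phase length such as 498. Note that a length which is a multiple of 3 is necessary but not sufficient for a non-polar insertion, since the inserted fragment must also lack in-frame stop codons; that check is outside the scope of this sketch.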
The mutated DNA can be reintroduced into the M. marinum genome by any of a
variety of well-characterized methods. In a most preferred embodiment, the mutation is
introduced into the genome by allelic exchange (homologous recombination). Methods
for using long linear recombination substrates for allelic exchange in Mycobacteria are
provided, e.g., in Balasubramanian, V. et al (1996). J. Bacteriol. 178, 273-279. Other
methods for homologous recombination are found, e.g., in Aldovini, A.R. et al (1993). J.
Bacteriol. 175, 7282-7289; Norman, E. et al (1995). Mol. Microbiol. 16, 755-760;
Baulard, A. et al (1996). J. Bacteriol. 178, 3091-3098; Marklund, B.I. et al (1995). J.
Bacteriol. 177, 6100-6105; Ramakrishnan, L. et al (1997). J. Bacteriol. 179, 5862-5868;
and U.S. Pat. No. 5,700,683.

Simultaneously with the characterization of a virulence defect in an M. marinum
mutant, or prior or subsequent to such characterization, the gene which is disrupted by the
transposon insertion can be identified and characterized. In one embodiment, regions
flanking one or both sides of an integrated transposon are characterized by hybridization
to a panel of selected sequences. In a most preferred embodiment, the flanking regions are
sequenced in order to identify the gene which has been disrupted. Many sequencing
methods are, of course, well-known to those of ordinary skill in the art. Example 5
describes two methods to sequence directly the flanking regions, as well as methods to
first clone and then sequence such regions. In a most preferred embodiment, genomic
sequences flanking a transposon are amplified using a strategy called ligation-mediated
PCR (LMPCR) (Prod'hom et al (1998). FEMS Microbiology Letters 75-81). Briefly,
this method uses one primer specific for the known sequence (IS (insertion sequence)
present on both ends of the transposon) and a second specific for a synthetic linker ligated
to restricted genomic DNA. This method is illustrated in Figures 11A and B. The size
of the flanking regions which can be analyzed is limited by factors such as the fragment
size that can be amplified by PCR, and can be readily determined by one of skill in the art.
In a most preferred embodiment, a flanking region is about 100 to about 1,000 bases long.

The comparison of sequences of previously uncharacterized virulence genes in M.
marinum to sequences in publicly available DNA and protein databases from a variety of
sources (e.g., GenBank, EMBL, DDBJ, SWISS-PROT, PRF, PDB, RefSeq, etc.) can aid
in the identification of (functional) homologues, and can add insight into the role a
virulence gene plays in the molecular pathogenesis pathways of mycobacteria in an animal
host.

Optimal alignment of sequences may be conducted by the local homology
algorithm of Smith and Waterman (1981). Adv. Appl. Math. 2, 482; by the homology
alignment algorithm of Needleman and Wunsch (1970). J. Mol. Biol. 48, 443; by the
search for similarity method of Pearson and Lipman (1988). Proc. Natl. Acad. Sci. 85,
2444; or by computerized implementations of these algorithms (e.g., GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics
Computer Group, 575 Science Dr., Madison, Wis.). Other such computer programs
include, e.g., BLAST and FASTA (Altschul, S.F. et al (1990). J. Mol. Biol. 215, 403-410);
BLASTX; TBLASTN; Gapped BLAST and PSI-BLAST (Altschul, S.F. et al (1997),
Nucleic Acids Res. 25, 3389-3402). Alternatively, the sequences can be aligned by
inspection. The best alignment (i.e., resulting in the highest percentage of sequence
similarity over the comparison window) generated by the various methods is selected. In
a most preferred embodiment, the BLAST blastx program is used.
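To illustrate the kind of computation performed by a local homology algorithm, the following is a minimal sketch of Smith-Waterman scoring. It returns only the best local alignment score, uses a simple match/mismatch scheme with a linear gap penalty, and the particular scoring values are assumptions chosen for the example; it is not a substitute for the software packages named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions. Only the score is
    computed (no traceback), keeping one row of the matrix at a time.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these values, two identical four-base sequences score 8, and two sequences sharing no bases score 0.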

Typically, a polynucleotide sequence of interest is translated into all six possible 
reading frames and is searched with the NCBI Blast search, selecting blastx. This 
translated sequence is first run against the EMBL database to identify functional 
homologs. Then, if desired, the sequence is searched with the advanced Blast program, 
against Mycobacterium sequences in particular. In a preferred embodiment, sequences 
identified by such a homology alignment exhibit substantial identity to the sequence of 
interest. Of course, any selected degree of sequence identity can be the basis of such a 
comparison, e.g., about 30-50%, about 50-70% or about 70-90% sequence identity at the 
nucleotide or amino acid level. 
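The six-frame translation step described above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the standard codon assignments are assumed, and in practice the blastx program performs this translation itself before searching a protein database.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (an assumption of this sketch)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to peptides."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, peptide in six_frame_translations("ATGGCCTAA").items():
        print(label, peptide)
```

Each of the six peptides could then be compared against a protein database; stop codons appear as `*` in the output.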

The following terms are used to describe the sequence relationships between two 
or more polynucleotides or polypeptides: "reference sequence," "comparison window," 
"sequence identity," "percentage of sequence identity," and "substantial identity." 

A "reference sequence" is a defined sequence used as a basis for a sequence 
comparison; a reference sequence may be a subset of a larger sequence, for example, a 
segment of a full-length cDNA or gene sequence given in a sequence listing, or may 
comprise a complete cDNA or gene sequence. Generally, a reference is at least about 10 
nucleotides in length, frequently at least about 20 to 25 nucleotides in length, and often at 
least about 50 nucleotides in length. In a preferred embodiment, a reference sequence is 
at least about 100 nucleotides in length, frequently at least about 150-300 nucleotides in 
length. Sequence comparisons between two (or more) polynucleotides are typically 
performed by comparing sequences of the two polynucleotides over a "comparison 
window" to identify and compare local regions of sequence similarity. A "comparison 
window," as used herein, refers to a segment of at least about 10 contiguous nucleotide 
positions wherein a polynucleotide sequence may be compared to a reference sequence of 
at least about 10 contiguous nucleotides and wherein the portion of the polynucleotide 
sequence in the comparison window may comprise additions and deletions (i.e., gaps) of 
about 20 percent or less as compared to the reference sequence (which does not comprise 
additions or deletions) for optimal alignment of the two sequences. 

The term "sequence identity" means that two polynucleotide or polypeptide 
sequences are identical (e.g., on a nucleotide-by-nucleotide or amino acid-by-amino acid 
basis) over the window of comparison. The term "percentage of sequence identity" is 
calculated by comparing two optimally aligned sequences over the window of comparison, 
determining the number of positions at which the identical nucleic acid base (e.g., A, T, 
C, G, U, or I) or amino acid residue occurs in both sequences to yield the number of 
matched positions, dividing the number of matched positions by the total number of 
positions in the window of comparison (i.e., the window size), and multiplying the result 
by 100 to yield the percentage of sequence identity. The term "identical" in the context of 
two nucleic acid or polypeptide sequences refers to the residues in the two sequences 
which are the same when aligned for maximum correspondence. 
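The calculation of "percentage of sequence identity" defined above can be illustrated with a short sketch. It assumes the two sequences have already been optimally aligned and padded to equal length, with "-" marking gap positions; the function name and the gap symbol are illustrative choices, not part of this specification.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical positions over the comparison window.

    Both inputs must be pre-aligned strings of equal length. The window
    size is the full aligned length, so gap positions count as
    non-matching positions in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

For example, two aligned 10-nucleotide sequences that differ at one position give 9 matched positions out of a window of 10.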

The term "substantial identity" or "substantial similarity" indicates that a nucleic 
acid or polypeptide comprises a sequence that has at least about 90% sequence identity to 
a reference sequence, or preferably at least about 95%, or more preferably at least about 
98% sequence identity to the reference sequence, over a comparison window of at least 
about 10 to about 100 or more nucleotides or amino acid residues. An indication that two 
polypeptide sequences are substantially identical is that one protein is immunologically 
reactive with antibodies raised against the second protein. An indication that two nucleic 
acid sequences are substantially identical is that the polypeptide which the first nucleic 
acid encodes is immunologically cross-reactive with the polypeptide encoded by the 
second nucleic acid. 

Another indication that two nucleic acid sequences are substantially identical is 
that the two molecules hybridize to each other under selected high stringency conditions. 
High stringency conditions are sequence-dependent and will be different with different 
environmental parameters. Generally, high stringency conditions are selected to be about 
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a 
defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and 
pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. 
Typically, high stringency conditions will be those in which the salt concentration is at least 
about 0.2 molar at pH 7 and the temperature is at least about 60°C. 

Analyses of the peptides or proteins which can be translated from flanking DNA 
sequences can be particularly informative for identifying functional homologues. The 
similarity between two polypeptides is determined by comparing the amino acid sequence 
and its conserved amino acid substitutes of one polypeptide to the sequence of a second 
polypeptide. Alignment procedures such as those discussed above can be used. 

The sequencing and characterization of regions flanking thirteen transposons which 
have independently integrated into M. marinum, rendering the bacteria avirulent in the 
goldfish model, is shown in Example 9. At least six of the M. marinum mutant genes are 
closely related to a previously identified functional homologue(s) from another organism, 
e.g., a transcriptional regulator from Streptomyces coelicolor which belongs to the AraC 
family of transcriptional regulators; an integral membrane protein; polyketide synthase 
genes from Streptomyces and Pseudomonas bacteria; a sulfate adenylyltransferase with 
homology to diverse organisms including Pyrococcus abyssi, Synechocystis sp., and 
Bacillus subtilis; a cysQ gene, or dhbF from B. subtilis. The possible significance of these 
functional properties for M. marinum virulence is discussed in Example 9. 

The flanking sequences in M. marinum can also be compared in a similar manner 
to databanks of mycobacteria sequences, using the Advanced Blast search from NCBI and 
selecting Mycobacterium as the genome, and/or the complete sequence of M. tuberculosis 
(Cole, S.T. et al. (1998). Nature 393, 537-558), in order to identify virulence genes in other 
mycobacteria. In a most preferred embodiment, this method can be used to identify 
virulence genes of M. tuberculosis. For example, Example 9 shows that the thirteen M. 
marinum virulence genes examined have functional homologues in M. tuberculosis. 






WO 01/19993 



PCT/US00/25512 




Methods to clone such M. tuberculosis homologues are routine in the art. See, e.g., 
Example 7. 

Defined mutations can be introduced into cloned, putative virulence genes of M. 
tuberculosis by methods similar to those discussed above for mutagenizing cloned 
M. marinum genes. The mutations can be made in M. tuberculosis either before or after 
the corresponding mutations in M. marinum have been characterized. Any of the types of 
mutations described above can be introduced into an M. tuberculosis gene, including 
knockouts of a large portion of the gene, up to and including the entire coding sequence. In order 
to facilitate the generation of mutants in M. tuberculosis, conventional, routine procedures 
can be used to identify those regions of the M. tuberculosis gene which correspond to the 
site of mutation in the corresponding M. marinum gene. For example, corresponding 
active sites and/or functional domains can be identified by, e.g., comparing the sequences 
or modeling the predicted protein structures. The mutated DNA can then be reintroduced 
into the M. tuberculosis genome by methods similar to those described above for 
reintroducing mutations into the M. marinum genome. Several such methods are described 
in Example 7. In a most preferred embodiment, the defined mutation is reintroduced into 
the M. tuberculosis genome by homologous recombination using a long linear 
recombination substrate. The phenotypic effect of an M. tuberculosis mutation can be 
determined routinely with one of several available animal models for this organism, 
including, e.g., the infection models with guinea pig (Collins, D.M. et al. (1995). PNAS 
92, 8036-8040; B. Bloom, ed., (1994). Tuberculosis: Pathogenesis, Protection, and 
Control, ASM Press, Washington, D.C. Chapter 9); mouse and rabbit (B. Bloom, ed., ibid., 
Chapters 8 and 10, respectively); and monkey (Walsh et al. (1996). Nature Medicine 2, 
430-436). 

The invention encompasses virulence genes (e.g., isolated virulence genes) as 
described elsewhere herein, from M. marinum and/or M. tuberculosis, which are identified 
by the methods of the invention, and/or variants (e.g., naturally- or non-naturally-occurring 
modifications, mutations, polymorphisms, etc.) or fragments thereof. By a "variant" of 
a gene or fragment is meant, as used herein, a replacement, deletion, insertion or other 
modification of the gene or fragment. It is preferred that the variant has at least about 70% 
sequence identity, more preferably at least about 85% sequence identity, most preferably 
at least about 95% or 98% sequence identity with the gene or fragment. The degree of 
similarity can be determined using any of the methods disclosed herein. By a "fragment" 
of a gene is meant a single-stranded or double-stranded nucleic acid (e.g., oligonucleotide) 
of a size smaller than that of the gene, obtained by any of a variety of conventional means, 
e.g., digestion with restriction enzymes, PCR amplification, synthesis with an 
oligonucleotide synthesizer, synthesis with a DNA or RNA polymerase, or the like. Such 
fragments can be used, for example, to diagnose the presence of a gene in a sample of 
interest, e.g., by serving as a hybridization probe or a PCR primer. Such diagnostic assays 
can be set up and performed by routine, conventional procedures in the art. In another 
embodiment, such fragments can be used to screen for virulent strains of bacteria, e.g., 
bacteria which comprise a polynucleotide that encodes a particular virulence gene or a 
fragment thereof. Of course, full-length virulence genes of the invention and variants 
thereof can also be used in diagnostic assays. 

The invention also encompasses polynucleotides which are complementary to a 
gene of the invention or fragment thereof, or which hybridize to such a gene or fragment 
under selected (e.g., high) stringency conditions. For example, the invention encompasses 
an oligonucleotide complementary to a portion of a virulence gene which can be used, e.g., 
as an antisense oligonucleotide to regulate expression of the gene, e.g., in a method of 
therapy. Methods to make and use antisense molecules of this type are conventional and 
routine, and are presented, e.g., in U.S. Pat. Nos. 5,876,931 and 5,585,479 and in 
references cited therein. Similarly, ribozymes comprising such fragments can be used in 
a method of treatment. Methods of making and using ribozymes are also conventional in 
the art. 

Of course, the genes and fragments discussed herein can be any form of 
polynucleotide or nucleic acid, e.g., naturally occurring, synthetic or intentionally 
manipulated polynucleotides, wherein nucleotide bases or modified bases are linked by 
various known linkages, e.g., ester, phosphodiester, sulfamate, sulfamide, 
phosphorothioate, phosphoramidate, methyl phosphonate, carbamate, or other bonds, 
depending on the desired purpose, e.g., resistance to nucleases, such as RNase H, 
improved in vivo stability, etc. Various modifications can be made to nucleic acids, such 
as attaching detectable markers (e.g., avidin, biotin, radioactive or fluorescent elements, 
ligands), or moieties which improve hybridization, detection or stability. The 
polynucleotides can be DNA, cDNA, RNA, PNA, synthetic nucleic acid, modified nucleic 
acid, or mixtures thereof. Polynucleotides can be of any size, e.g., ranging from short 
oligonucleotides to large gene clusters or operons. Either or both strands of a 
double-stranded nucleic acid are included. 

The invention also encompasses peptides or polypeptides encoded by and/or 
expressed from M. marinum and/or M. tuberculosis genes identified by the methods of the 
invention, and/or variants or fragments thereof, and products which are generated by such 
peptides or polypeptides. The term "genes identified by the methods of the invention" 
encompasses any gene in a given operon, a mutation in one of whose genes results in an 
avirulent phenotype (e.g., the gene can be a downstream gene whose expression is 
diminished or abolished because of an upstream polar mutation, or a gene whose gene 
product interacts with another gene product of the operon, etc.). 

The peptides or polypeptides can be isolated (e.g., purified) from bacteria directly, 
or they can be expressed recombinantly and isolated (e.g., purified) from recombinant 
organisms. Methods of isolating, purifying and sequencing naturally produced or 
recombinantly produced peptides and polypeptides are conventional and routine in the art. 
The genes can be cloned into any of a variety of expression vectors. The sequences to be 
expressed can be genomic sequences, e.g., subcloned sequences from a cosmid library as 
described in Example 6, or they can be corresponding cDNA sequences, obtained by 
conventional means. In some cases, it may be desirable to express a fragment of a gene, 
or more than one gene, e.g., as many as the genes of an entire operon. Vectors and 
appropriate regulatory elements for expressing genes in a variety of cell types or hosts, 
including prokaryotes, yeast, and mammalian, insect and plant cells, and methods of 
cloning and expressing genes or gene fragments, are routine in the art and are discussed, 
e.g., in U.S. Pat. Nos. 5,876,931, 5,700,683, 4,440,859, 4,530,901, 4,582,800, 4,677,063, 
4,678,751, 4,704,362, 4,710,463, 4,757,006, 4,766,075 and 4,810,648. 



The invention also encompasses a host transformed to express a peptide or 
polypeptide of the invention, or a host which is mutated so the expression of a peptide or 
polypeptide of the invention is disrupted (e.g., inhibited), or progeny of such hosts. 

"Variants" of the peptides or polypeptides are also included in the invention, e.g., 
insertions, deletions and substitutions, either conservative or non-conservative, where such 
changes do not substantially alter the normal function of the protein. By "conservative 
substitutions" is meant combinations such as Gly, Ala; Val, Ile, Leu; Asp, Glu; Asn, 
Gln; Ser, Thr; Lys, Arg; and Phe, Tyr. Variants can include, e.g., homologs, muteins and 
mimetics. Many types of protein modifications, including post-translational 
modifications, are included. See, e.g., modifications disclosed in U.S. Pat. No. 5,935,835. 
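The conservative substitution groups listed above (Gly/Ala; Val/Ile/Leu; Asp/Glu; Asn/Gln; Ser/Thr; Lys/Arg; Phe/Tyr) can be expressed as a simple lookup. The following sketch, with hypothetical function names and one-letter amino acid codes, classifies each difference between a reference peptide and a variant of the same length; it handles substitutions only, not insertions or deletions.

```python
# Conservative groups as listed in the text, in one-letter codes
CONSERVATIVE_GROUPS = [
    {"G", "A"},        # Gly, Ala
    {"V", "I", "L"},   # Val, Ile, Leu
    {"D", "E"},        # Asp, Glu
    {"N", "Q"},        # Asn, Gln
    {"S", "T"},        # Ser, Thr
    {"K", "R"},        # Lys, Arg
    {"F", "Y"},        # Phe, Tyr
]


def is_conservative(original, replacement):
    """True if the two different residues fall in the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, kind) for each differing position (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    result = []
    for pos, (r, v) in enumerate(zip(reference, variant), start=1):
        if r != v:
            kind = "conservative" if is_conservative(r, v) else "non-conservative"
            result.append((pos, r, v, kind))
    return result


if __name__ == "__main__":
    print(classify_substitutions("GVDK", "AVEW"))
```

Whether a given change actually preserves the normal function of the protein still has to be determined experimentally; the lookup only reflects the groups named in the text.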

"Fragments" of the peptides or polypeptides are also included in the invention. 
These fragments can be of any length. In a preferred embodiment, a fragment is functional 
(e.g., has biological activity, can inhibit or enhance the activity of a protein or other 
substance, contains one or more immunogenic epitopes, etc.). In a most preferred 
embodiment, the fragment contains all or a subset of the amino acids of SEQ ID NOs: 5, 
7, 9, 12, 15, 20, 22, 24, 26, 28, 30, 32-34, 36-38, 40 or 42. 

Among the polypeptides of particular interest are polyketide synthases. Example 
9, for example, shows that an M. marinum virulence gene identified by the method of the 
invention, and an M. tuberculosis homologue of it, appear to be polyketide synthase genes. 
As is well-known, many polyketides have therapeutic value (for human, veterinary, or 
aquaculture uses). For example, polyketides have been shown to function as antibiotics, 
chemotherapeutic agents or immunosuppressive agents, e.g., in transplant patients. The 
invention includes the generation and/or isolation (e.g., purification) of polyketide 
synthases encoded by virulence genes identified by the method of the invention, as well 
as polyketides produced by those synthases. The polyketides can be generated by 
recombinant means, isolated from non-recombinant bacteria, or produced synthetically. 
Methods for making, isolating and purifying polyketides are routine and well-known in 
the art. 

Recombinantly expressed polypeptides of the invention can also be used to confirm 
that a particular virulence gene is responsible, at least in part, for a pathogenic phenotype 
in an organism - that is, to confirm Koch's postulates. Example 8 shows how a 
recombinantly expressed M. marinum putative virulence gene can be used to complement 
a mutant bacterium which is defective in that gene, and to restore a virulent phenotype in 
fish infected by the complemented mutant. 

Virulence genes of the invention and peptides thereof can contain antigenic 
epitopes. The invention also encompasses antibodies, including polyclonal or monoclonal 
antibodies, or fragments of polyclonal or monoclonal antibodies, which are generated in 
response to such epitopes. Such antibodies can be used, e.g., in diagnostic assays to detect 
the presence of a mycobacterium, to identify virulent strains of bacteria, or in methods to 
treat disease conditions caused or exacerbated by a virulence protein (e.g., passive 
immunization), following routine, art-recognized procedures. 

The invention also encompasses an avirulent mycobacterium, preferably M. 
marinum and/or M. tuberculosis, which harbors one or more mutation(s) in one or more 
virulence gene(s) identified by the methods of the invention, or a pharmaceutical 
composition which comprises such a bacterium and a pharmaceutically acceptable carrier. 

In a preferred embodiment, the avirulent bacterium is introduced into a host (e.g., 
a fish, cow or human) in order to elicit an immune response. Because the bacterium is 
avirulent (e.g., attenuated), it is expected to be suitable for administration to a host in need 
of treatment, but it is also expected to be antigenic and to give rise to an immune response, 
preferably a protective immune response. For such a use, it is preferred that the mutation 
is substantially non-revertable, e.g., a deletion or frame-shift mutation. To ensure 
non-revertability, it is preferable that a bacterium comprises at least two or three such 
mutations, preferably in different genes. A small deletion mutant would be expected to 
provide antigenic epitopes in the portion of the protein which lies downstream of the 
deletion, even though the protein, itself, is not functional with respect to virulence. 

Another embodiment of the invention is a vaccine comprising a suitable avirulent 
mycobacterium of the invention and a pharmaceutically acceptable carrier. By vaccine is 
meant an agent used to stimulate the immune system of a living organism so that 
protection against future harm is provided. Immunization refers to the process of inducing, 
in an organism, an antibody and/or cellular immune response (in which T-lymphocytes can 
kill the pathogen and/or activate other cells, e.g., phagocytes, to do so) which is 
directed against a pathogen or antigen to which the organism has been previously exposed. 
The term "immune response," as used herein, encompasses, for example, mechanisms by 
which a multi-cellular organism produces antibodies against an antigenic material which 
invades the cells of the organism or the extra-cellular fluid of the organism. The antibody 
so produced may belong to any of the immunological classes, such as immunoglobulins 
A, D, E, G or M. Other types of responses, for example cellular and humoral immunity, are 
also included. Immune response to antigens is well studied and widely reported. A survey 
of immunology is given, e.g., in Roitt I., (1994). Essential Immunology, Blackwell 
Scientific Publications, London. Methods in immunology are routine and conventional 
(see, e.g., in Current Protocols in Immunology; Edited by John E. Coligan et al., John 
Wiley & Sons, Inc.). 

Methods of formulating, testing, optimizing and administering vaccines of the 
invention are routine and conventional, and are described, e.g., in U.S. Pat. Nos. 
5,876,931 and 5,700,683, and references cited therein, and in "New Generation Vaccines," 
edited by M.M. Levine et al., 2nd edition, Marcel Dekker, Inc., New York, NY, 1997. 
Active immunization of a patient (e.g., human, fish, cow, etc.) is preferred. In this 
approach, one or more mutant bacteria are prepared in an immunogenic formulation 
containing suitable adjuvants and carriers and administered to the patient in known ways. 
Suitable adjuvants include Freund's complete or incomplete adjuvant, muramyl dipeptide, 
the "Iscoms" of EP 109 942, EP 180 564 and EP 231 039, aluminum hydroxide, saponin, 
DEAE-dextran, neutral oils (such as miglyol), vegetable oils (such as arachis oil), 
liposomes, Pluronic polyols or the Ribi adjuvant system (see, for example GB-A-2 189 
141). "Pluronic" is a Registered Trade Mark. The patient to be immunized is a patient 
requiring to be protected from the disease caused by, or exacerbated by, the virulent form 
of the bacterium. 

The aforementioned avirulent bacteria of the invention or a formulation thereof 
may be administered by any conventional method, including oral administration and parenteral (e.g., 
subcutaneous or intramuscular) injection. The treatment may consist of a single dose or 
a plurality of doses over a period of time. While it is possible for an avirulent bacterium 
of the invention to be administered alone, it is preferable to present it as a pharmaceutical 
formulation, together with one or more acceptable carriers. The carrier(s) must be 
"acceptable" in the sense of being compatible with the avirulent microorganism of the 
invention and not deleterious to the recipients thereof. Typically, the carriers will be water 
or saline which will be sterile and pyrogen free. 

It will be appreciated that a vaccine of the invention, depending on its bacterial 
component, may be useful in the fields of human medicine, veterinary medicine, or 
aquaculture. A vaccine for fish against Mycobacterium marinum could be of particularly 
significant economic importance. Mycobacterium marinum causes tuberculosis in more 
than 150 species of both salt-water and fresh-water fish, among them salmonid trout 
(Salmo gairdneri, Salmo trutta, Oncorhynchus mykiss), striped bass, tilapia, etc. 
Aquaculture facilities infected with M. marinum suffer from a constant mortality rate over 
a long period of time accompanied by severe economic losses, which could be ameliorated 
with such a vaccine. A vaccine against M. tuberculosis could, of course, be a significant 
weapon in the battle against tuberculosis, which is wide-spread in human populations. 

Vaccines encompassed by the invention also include killed bacterial vaccines; 
subunit vaccines comprising a virulence protein(s) of the invention (e.g., a wild type or 
mutant protein(s), or a variant(s) thereof), or an antigenic fragment(s) thereof; bacteria 
which produce or are capable of producing such virulence proteins or fragments; and DNA 
vaccines comprising a nucleic acid which encodes such a virulence protein or fragment 
thereof. Methods of making and using such vaccines are routine and conventional in the 
art. For methods of making and using DNA vaccines, see, e.g., U.S. Pat. No. 5,589,466. 

An avirulent bacterium of the invention can also be used as a "carrier" for the 
expression of one or more cloned heterologous gene(s) or fragments thereof. For example, 
an avirulent M. marinum organism can be used to express a secreted or surface-expressed 
heterologous peptide or polypeptide in fish, and an avirulent M. tuberculosis organism can 
be so used in humans. The avirulent bacterium can be used to express, e.g., an allergen, 
or an antigenic epitope from another pathogen, for which the modified bacterium can act 
as a vaccine. In a preferred embodiment, the heterologous gene is inserted at or near the 
position at which the transposon was inserted in an avirulent mutant, or at or near the site 
of the more "well-defined" avirulent mutation. Methods to clone heterologous genes are 
routine, as are methods to express them in a host. Methods of making and using such 
carriers are disclosed, e.g., in U.S. Pat. Nos. 5,876,931 and 5,424,065. 

The invention also encompasses a method for identifying an agent which reduces 
the ability of a microorganism to survive in a host, e.g., an anti-mycobacterial agent which 
inhibits expression of a virulence gene, or which attacks products produced directly or 
indirectly by a virulence gene. In a preferred embodiment, such an agent can be used to 
treat a disease caused by, or exacerbated by, a virulence gene of the invention. One such 
method, as disclosed, e.g., in U.S. Pat. No. 5,876,931, is to generate a bacterium which 
over-expresses the virulence gene, and then to identify an agent which reduces the viability 
or growth of a wild type cell but not the cell overexpressing the gene, in a host. Methods 
to generate the over-expressing strain, and to perform such screening procedures, are 
routine and are described, e.g., in U.S. Pat. No. 5,876,931. Other methods to screen for 
anti-mycobacterial drugs are routine and are described, e.g., in U.S. Pat. No. 5,700,683. 

The invention also relates to a method of screening vaccine candidates for human 
tuberculosis in the fish model. In one embodiment, based on the assumption that M. 
marinum bacteria may be suitable for human vaccines, goldfish can be inoculated with an 
M. marinum vaccine candidate of interest. The fish are then challenged with fully virulent 
M. marinum at a dose capable of establishing disease. A vaccine which, when inoculated 
into a fish, protects the fish from subsequent virulent challenge, as shown by the fish failing to 
develop disease symptoms, is a candidate for a human vaccine. In another embodiment, 
a putative virulence gene of M. tuberculosis is selected, and a mutation is made in the M. 
marinum homologue of that gene. The mutant M. marinum is then tested as a vaccine 
candidate, using the goldfish model as above. 

Brief Description of the Figures 

Fig. 1 shows the median survival time (MST) of fish inoculated with M. marinum. 
The median survival time of fish (days) inoculated with M. marinum at doses indicated per 
fish is compared to a phosphate buffered saline (PBS) control. *survival to endpoint of 
experiment, 56 days. 

Fig. 2 shows a comparison of the growth of M. marinum in liver, spleen and 
kidney. The inoculum is 10^7 CFU/fish. Results are given as geometric means ± standard 
error for eight fish per time point. 

Fig. 3 shows a comparison of mean cumulative granuloma scores (MCGSs) over 
time of fish infected with 10^7 CFU of M. marinum organisms. The results are given as a 
vertical box plot, with horizontal lines marking the median 10th, 25th, 50th, 75th and 95th 
percentile points of GSs for eight animals at each time point. The mean of each group is 
represented by a thick line. At 2 weeks, the median 50th percentile and mean values are 
the same. 

Fig. 4 shows a survival curve of goldfish inoculated with 10^8 CFU of M. marinum 
1218R (wild type) or 1218S (mutant). 

Fig. 5 shows the modification of pYUB285 with transposon tags. Bg is BglII; 
Bam is BamHI; H is HindIII; IR are inverted repeats which mark the boundaries of the 
transposon; ORFR and ORFA are transposon genes; aph is the gene for kanamycin 
resistance; oriE is the E. coli ori; and ΔoriM is the disabled mycobacterial ori. 

Fig. 6 shows the construction of an M. marinum signature-tagged mutant library. 

Fig. 7 shows a schematic diagram of an M. marinum mutant library screen in the 
goldfish model. 

Fig. 8 shows a survival curve of M. marinum mutant 41.2. 

Fig. 9 shows a survival curve of M. marinum mutant 80.1. 

Fig. 10 shows a survival curve of M. marinum mutant 86.1. 

Figs. 11A and B illustrate ligation-mediated PCR. 

Fig. 12 shows Competitive Indices of M. marinum mutants 32.2, 60.2, 62.2, 67.1, 
80.1, 86.1, 42.2, 80.8 and 68.6. 

Fig. 13 shows a survival curve of M. marinum mutant 67.1. 

Fig. 14 shows a survival curve of M. marinum mutant 39.2. 

Fig. 15 shows a survival curve of M. marinum mutant 42.2. 

Examples 

Example 1. Properties of the M. marinum/goldfish model 

A. Median Survival Time and LD50. 

To determine the median survival time of goldfish after inoculation with M. 
marinum strain ATCC 927, groups of 20 to 32 fish were inoculated intraperitoneally with 
10^9, 10^8, or 10^7 colony forming units (CFU). The median survival time of goldfish 
inoculated with M. marinum was dose dependent, with survival time decreasing with 
increasing doses of bacteria. The median survival time of fish was 4, 10, and >56 days 
(the endpoint of the experiment) with inocula of 10^9, 10^8, or 10^7 M. marinum organisms, 
respectively. All fish inoculated with 10^7 CFU or less survived to the end point of the 
experiment (56 days). The control fish group, inoculated with PBS in 5 separate 
experiments, had a total of two premature deaths, one at 8 and one at 19 days post- 
inoculation, from a total of 55 fish. The remainder of the control fish survived to 56 days, 
the endpoint of the experiment (see Figure 1). The LD50 at 1 week postinfection with M. 
marinum was 4.5 x 10^8 (calculated by the method of Reed & Muench, 1938. Am. J. Hyg. 
27, 493-497). 
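For readers unfamiliar with the Reed and Muench calculation, the following sketch shows the general procedure: deaths are accumulated from the lowest dose upward, survivors from the highest dose downward, and the 50% endpoint is interpolated on a log10 dose scale. The per-dose mortality counts in the usage example are hypothetical placeholders and are not the data of this Example; the function name is likewise an assumption.

```python
import math


def reed_muench_ld50(doses, dead, total):
    """Estimate the LD50 by the Reed and Muench cumulative method.

    doses: dose per group (e.g., CFU per fish)
    dead:  number of animals that died in each group
    total: number of animals inoculated in each group
    """
    rows = sorted(zip(doses, dead, total))
    if any(t <= 0 for _, _, t in rows):
        raise ValueError("every dose group must contain at least one animal")
    alive = [t - d for _, d, t in rows]

    # Cumulative deaths: an animal killed by a low dose is assumed to
    # have died at any higher dose as well.
    cum_dead, running = [], 0
    for _, d, _ in rows:
        running += d
        cum_dead.append(running)

    # Cumulative survivors: an animal surviving a high dose is assumed
    # to have survived any lower dose as well.
    cum_alive, running = [0] * len(rows), 0
    for i in range(len(rows) - 1, -1, -1):
        running += alive[i]
        cum_alive[i] = running

    pct = [100.0 * cd / (cd + ca) for cd, ca in zip(cum_dead, cum_alive)]

    # Interpolate between the two doses that bracket 50% mortality.
    for i in range(1, len(rows)):
        lo, hi = pct[i - 1], pct[i]
        if lo < 50.0 <= hi:
            distance = (50.0 - lo) / (hi - lo)
            step = math.log10(rows[i][0] / rows[i - 1][0])
            return 10 ** (math.log10(rows[i - 1][0]) + distance * step)
    raise ValueError("mortality data do not bracket 50%")


if __name__ == "__main__":
    # Hypothetical counts for illustration only
    print(reed_muench_ld50([1e7, 1e8, 1e9], [0, 3, 10], [10, 10, 10]))
```

With the hypothetical counts shown, cumulative mortality is 0%, 30% and 100% at the three doses, so the estimate falls between the two higher doses.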

B. Mycobacterial recovery from fish organs. 

To assess the ability of M. marinum to persist in goldfish tissue, the liver, spleen, 
and kidneys from each sacrificed fish were collected for bacteriological examination. M. 
marinum was recovered from all organs of fish in the 10^9 or 10^8 CFU inoculum groups. 
In fish inoculated with 10^7 CFU, M. marinum was recovered from 96% of the examined 
organs. 

The fate over an 8 week period of the M. marinum ATCC 927 strain in the livers, 
spleens, and kidneys of fish inoculated with 10^7 CFU was followed (see Figure 2). There 
was a significant positive linear relationship between time postinoculation and colony 
recovery in the liver (P < 0.001); for the spleen and kidneys, the relationship was positive 
but did not reach statistical significance (P = 0.054 and P = 0.091, respectively). Between 
8 and 16 weeks postinoculation, M. marinum persisted in the tissue with no significant 
change in the colony counts. In addition, in the 10^2 to 10^6 CFU inoculum groups, M. 
marinum was isolated from at least one organ from all infected fish. 

C. An acute and chronic form of mycobacterial infection. 

The pathology of infected fish was dependent on the inoculum dose and the time 
postinfection of animal sacrifice. Fish infected with either 10 9 or 10 8 CFU of M. marinum 

organisms suffered from anorexia, sluggish movement, and loss of equilibrium. 

The histopathology of fish infected with 10^9 and 10^8 CFU was characterized by
severe peritonitis and necrosis as compared to control fish. The peritoneum was filled
with inflammatory cells consisting of lymphocytes, macrophages, and fibrous connective
cells, as well as with degenerating cells and bacteria. The mean cumulative granuloma score
(MCGS) for these 2 groups was similar (0.2 for the 10^9 CFU group and 0.9 for the 10^8
CFU group). In the 10^8 CFU inoculum group, granuloma formation was more likely to
be found in animals which survived more than 2 weeks postinoculation.

When examined at 2 weeks, 6 of 8 fish in the 10^7 CFU group had moderate to
severe peritonitis. Unlike the 10^8 and 10^9 CFU inoculum groups, which succumbed to
infection, the 10^7 CFU inoculum group survived the infection, and by 4 to 6 weeks
postinoculation, the acute peritoneal inflammation was replaced by a chronic inflammatory
state. Fish inoculated with 10^7 CFU demonstrated granuloma formation in all organs
evaluated (MCGS of 5.0), including the peritoneum and pancreas, liver (e.g., onion ring
granuloma composed of epithelioid macrophages surrounding a necrotic center), spleen, 
trunk kidney, head kidney, heart and intestine. Pleomorphic granulomas (necrotizing, non- 
necrotizing and caseous) were seen. The necrotizing granulomas were characterized by 
a central area of necrosis surrounded by macrophages, epithelioid cells, and thin fibrous 
connective tissue. Frequently, caseous necrosis was present in the central area of the 
granuloma. Granulomas containing foamy macrophages were also seen. Occasionally, 
Langhans and foreign body type giant cells were observed. In addition, acid fast bacilli 
could be demonstrated with the modified Ziehl-Neelsen stain. Melanomacrophage centers 
were seen in a few cases. 



The chronic inflammatory response of fish towards M. marinum was time
dependent, as seen by the increase in mean cumulative granuloma scores (MCGSs) with
time, up to 8 weeks, in animals inoculated with 10^7 CFU (see Figure 3). From 8 to 16
weeks postinoculation, there was no significant change in MCGSs (5.0 and 5.7,
respectively).

D. Minimum infectious dose (MID). 

To estimate the lowest dose of M. marinum able to establish infection in
goldfish, groups of four fish were inoculated with M. marinum ATCC 927 at doses of 10^6,
10^5, 10^4, and 10^2 CFU. Granuloma formation was seen in 25% of the goldfish by 4 weeks
and in 88% by 8 weeks postinfection with a dose of 6.3 x 10^2 CFU or higher (Table 1).
The minimum number of organisms required to establish infection in goldfish appears to
be approximately 600 CFU.



Table 1. MID of M. marinum ATCC 927

Inoculum (CFU/fish)    No. positive(a)      MCGS
                       4 wk      8 wk
1.2 x 10^6             1/2       1/2        5.0
3.0 x 10^5             0/2       2/2        5.5
2.4 x 10^4             1/2       2/2        1.5
6.3 x 10^2             0/2       2/2        4.5

(a) Number of granuloma-positive animals per total number of animals
at 4 and 8 weeks postinoculation.
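
By way of illustration only, the 25% and 88% figures quoted for granuloma formation can be recomputed from the entries of Table 1, reading each entry as the number of granuloma-positive animals out of two animals examined. The dictionary layout and the function name below are illustrative choices and are not part of the original disclosure.

```python
# Granuloma-positive / total animals per inoculum, read from Table 1.
# Keys are the inoculum doses (CFU/fish); the layout is illustrative only.
TABLE_1 = {
    "1.2e6": {"4wk": (1, 2), "8wk": (1, 2)},
    "3.0e5": {"4wk": (0, 2), "8wk": (2, 2)},
    "2.4e4": {"4wk": (1, 2), "8wk": (2, 2)},
    "6.3e2": {"4wk": (0, 2), "8wk": (2, 2)},
}


def percent_positive(table, timepoint):
    """Pool all dose groups and return the percentage of positive animals."""
    positive = sum(row[timepoint][0] for row in table.values())
    total = sum(row[timepoint][1] for row in table.values())
    return 100.0 * positive / total


print(percent_positive(TABLE_1, "4wk"))  # 25.0
print(percent_positive(TABLE_1, "8wk"))  # 87.5, reported as 88% in the text
```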



E. Mycobacterial virulence assay.

The relative virulence of different strains of M. marinum, isolated from both human
and animal sources, was assessed. Three mycobacterial strains, M. marinum ATCC 927,
M, and F-110, were inoculated into goldfish at 10^8 CFU. The median survival times of fish
inoculated with M. marinum M, ATCC 927, and F-110 were similar, ranging from 4 to 10 days.

Example 2 - Differentiation of an avirulent M. marinum mutant from the wild type in the
goldfish model

The goldfish model can differentiate between virulent and avirulent M. marinum
organisms. A comparison of such a pair of strains is shown in Figure 4. The M. marinum
strains designated 1218R (wild type, also known as ATCC 927) and 1218S (avirulent mutant)
were inoculated into groups of 5 to 9 goldfish in two separate experiments at an inoculum dose
of 1.4 to 4 x 10^8 CFU. The median survival time of goldfish inoculated with M. marinum
1218R organisms was 3 days, compared to 28 days (the endpoint of the experiment) with M.
marinum 1218S organisms (see Figure 4). The mutant 1218S also failed to persist in the
mouse macrophage model. This experiment shows that the fish mycobacteriosis model
can allow the identification of M. marinum virulence genes.



Example 3 - Signature-tagged mutagenesis, and the generation of a library

A. Construction of a master bank of signature-tagged transposons

As an initial step in creating a bank of signature-tagged transposons, plasmid
pAT30 is generated (see Figure 5). A unique restriction site (BglII) is introduced into the
mycobacterial transposon delivery vector pYUB285 between ORFA and aph. The vector
is a suicide vector in mycobacteria because of inactivation of the mycobacterial origin of
replication by an internal deletion. A kanamycin resistance gene (aph) inserted into
IS1096 allows for a library of insertions in the mycobacterial genome to be generated upon
electroporation of the plasmid followed by selection for kanamycin resistance.

To generate a collection of signature-tagged transposons to be inserted into pAT30,
primers P5 (5'-CTAGGTACCTACAACCTC-3') (SEQ ID NO: 1) and P3 (5'-
CATGGTACCCATTCTAAC-3') (SEQ ID NO: 2) and the template RT1 oligonucleotide
(5'-CTAGGTACCTACAACCTCAAGCTT-[NK]20-AAGCTTGGTTAGAATGGGTACCATG-3')
(SEQ ID NO: 3) are prepared by
conventional, routine methods, preferably using a commercially available oligonucleotide
synthesizer. The 5' ends of primers P5 and P3 have BamHI sites. The template RT1
oligonucleotide is similar to that designed by Hensel et al., with a variable central region
(NK)20 flanked by arms of invariant sequences. The invariant arms allow the sequence
tags to be amplified in a PCR with the use of primers P3 and P5. The variable region is
designed to ensure that the same sequence occurs only about once in 2 x 10^17 molecules.
PCR is performed, using standard, routine methods (see, e.g., Innis, M.A. et al., eds. PCR
Protocols: a guide to methods and applications, 1990, Academic Press, San Diego, CA)
to generate and amplify double stranded, 90 bp signature tags. The PCR amplified tags are
digested with BamHI, gel purified, and then ligated to the BglII digested,
dephosphorylated (calf intestinal phosphatase, New England BioLabs, Inc.) pAT30
plasmid. E. coli DH5α is transformed with this ligation mixture and plasmids from 800
individual clones are isolated, arrayed in 96 well microtiter plates, and transferred to nylon
membranes. These plasmids are analyzed for hybridization and tag amplification
efficiency. In this example, ninety-six plasmids that are hybridization and amplification
efficient are chosen for the master plasmid collection. The master plasmids are screened
for cross hybridization with other plasmids in the master plasmid collection and any
cross-hybridizing plasmids are eliminated until the collection has no cross hybridizing members.
Of course, a master plasmid collection of any size can be constructed by this method.
Methods for carrying out STM mutagenesis and isolating bacterial virulence mutants are
described, e.g., in Hensel et al. (1995). Science 269, 400-403 and U.S. Pat. No. 5,876,931.
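
The structure of the RT1 template can be illustrated with a short sketch, offered only as an aid to understanding. It assembles one template molecule from the two invariant arms given above and a randomly generated (NK)20 region (N = A, C, G or T; K = G or T), and then recovers the variable tag as the sequence lying between the two HindIII sites (AAGCTT) of the arms. The function names and the use of a seeded random generator are assumptions made for the illustration.

```python
import random

ARM_5 = "CTAGGTACCTACAACCTCAAGCTT"   # invariant 5' arm of RT1, from the text
ARM_3 = "AAGCTTGGTTAGAATGGGTACCATG"  # invariant 3' arm of RT1, from the text
HINDIII = "AAGCTT"


def random_rt1(rng):
    """Build one RT1 molecule with a random (NK)20 variable region."""
    variable = "".join(rng.choice("ACGT") + rng.choice("GT") for _ in range(20))
    return ARM_5 + variable + ARM_3


def extract_tag(template):
    """Return the variable region lying between the two HindIII sites.

    Every second base of the variable region is G or T, so AAGCTT cannot
    occur inside it; the first and last AAGCTT belong to the arms.
    """
    start = template.index(HINDIII) + len(HINDIII)
    end = template.rindex(HINDIII)
    return template[start:end]


rng = random.Random(0)  # seeded only to make the illustration repeatable
molecule = random_rt1(rng)
tag = extract_tag(molecule)
print(len(molecule), len(tag))
```

With the arms as given, the assembled template is 89 nucleotides long and the recovered tag is 40 nucleotides long.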

B. Optimization and initial characterization of M. marinum transposition 

Several protocols for the preparation of competent cells from M. marinum are
evaluated. The strains tested are ATCC 927 (fish isolate) and M. marinum strain M
(human isolate). Electrocompetent cells are prepared from M. marinum cells grown to
different growth phases at different temperatures in the presence of ethionamide or
cycloheximide. Mycobacterial cells are transformed by electroporation with the
replicative Escherichia coli - mycobacteria shuttle vector, pYUB18 (Jacobs, W.R. et al.
(1991). Methods Enzymol. 204, 537-555), as well as the suicide vectors pYUB285
(McAdam, R.A. et al. (1995). Infect. Immun. 63, 1004-1012) and pUS252, carrying the
transposable elements IS1096 and IS6110, respectively (Dale, J.W. (1995). Eur. Respir.
J. 8, 633s-648s). Mutants of M. marinum are recovered on 7H10 agar plates supplemented
with kanamycin. Transformation and transposition efficiencies under different protocols
are compared, using routine, art-recognized procedures. See, e.g., McAdam et al. (1995).
Infect. Immun. 63, 1004-1012 and Cirillo, J.D. et al. (1991). J. Bacteriol. 173, 7772-7780.
Southern hybridization analysis is performed on mycobacterial mutants to confirm the
transposition events. These analyses show that: 1) competent cells prepared at room
temperature from late-exponential growth phase organisms yield a higher transposition
efficiency than cells prepared at 4°C or from early- or mid-exponential growth phase
organisms; 2) the highest efficiency for transposition is 10^2-10^3 CFU per µg of plasmid
DNA; and 3) the IS1096-derived transposon is best able to efficiently mutagenize M.
marinum.

To confirm that kanamycin-resistant M. marinum colonies are not spontaneous
mutants, colonies recovered after electroporation with the non-integrating, replicative
vector, pYUB18, are analyzed; the plasmid pYUB18 is successfully isolated from 6
separate transformants and is identified by restriction enzyme mapping. This indicates that
the transformants are not spontaneous mutants. In another experiment, 35 randomly
selected mutants recovered from electroporation of the suicide vector, pYUB285, are
examined by Southern analysis to determine whether transposition is random in the M.
marinum chromosome. All tested transposon mutants yield a single band, located in a
different position on the Southern blot, consistent with random integration of a single copy
of IS1096 into the M. marinum genome. Evaluation of 10 mutants obtained in a single
electroporation experiment shows that each mutant carries an insertion in a different part of
the M. marinum genome, indicating that the mutants from a given electroporation do not
represent siblings.

C. Generation of an M. marinum mutant library 

An M. marinum mutant library is generated by electroporating individual members
of the 96 master plasmid collection into M. marinum bacteria (see Figure 6). M. marinum
electrocompetent cells are prepared from a 100 ml culture grown to late exponential phase
(OD600 = 1.6 to 1.8). Bacteria are washed three times at room temperature with 10%
glycerol and then suspended in 1 ml 10% glycerol and distributed to 0.2 cm gap
electroporation cuvettes (Bio-Rad Laboratories). Electroporation is performed at room
temperature using a Gene Pulser (Bio-Rad Laboratories) with parameters of 2.5 kV, 25 µF,
and 800 Ω. Electroporated cells are rescued by growth overnight in 7H9 broth with 10%
albumin-dextrose complex enrichment (ADC) (52) at 30°C and plated on 7H10 agar with
kanamycin (20 µg/ml) and incubated at 30°C. Mutants appear 1 to 2 weeks after plating.
Mutants from each electroporation are named for the master plasmid used for transposon
delivery (e.g., plasmid pAT30-1 yields mutants 1.1, 1.2, etc.). In this example, 960 mutants are
isolated, 10 mutants per master plasmid. Of course, more mutants can be isolated per
master plasmid, and the 96 (or additional) master plasmids can be used to generate
additional mutants.



Example 4 - Screening an M. marinum library for potential avirulent mutants, using the
goldfish model 

A. Screening for mutants which show reduced viability in the goldfish host 

The M. marinum library obtained in Example 3 is screened for mutants which
exhibit a reduced ability to survive in the goldfish model. The library of M. marinum
transposon-tagged mutants is screened in pools; in this example, each pool has 48 mutants
(see Figure 7). Each of the mutants in a given pool is marked with a unique DNA tag (i.e.,
they are derived from 48 of the 96 master plasmids). To generate an input pool, the mutants
that make up the pool are grown in individual wells of a 96-well microtiter plate
containing 7H9 broth with ADC and kanamycin (20 µg/ml) at 30°C until they reach
OD600 = 0.6-0.8. The mutants are then pooled and an aliquot is removed for amplification
using colony PCR (input pool probe). The remaining pooled bacterial cells are
centrifuged, resuspended in phosphate buffered saline (PBS) to an inoculum dose of about
2 x 10^7 CFU/ml, sonicated for 3 minutes, and injected into three fish. The fish are sacrificed
at 7 days postinoculation and spleen, liver and kidney are harvested. The mutants that
have reached and multiplied within these organs are recovered by plating homogenates of
the organs onto laboratory medium. The recovered mutants from a given organ are
combined and an aliquot is used for amplification using colony PCR (output pool probe).
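
The pooling constraint described above, namely that each mutant in a pool carries a different tag, can be sketched as follows. The sketch is illustrative only: the naming scheme follows Example 3, but the particular way of assigning mutants to pools (taking the same isolate number from each block of 48 consecutive master plasmids) is a hypothetical scheme chosen for simplicity, and other assignments satisfy the constraint equally well.

```python
N_PLASMIDS = 96       # master plasmids, i.e. unique tags (from the text)
MUTANTS_PER_TAG = 10  # mutants isolated per master plasmid (from the text)
POOL_SIZE = 48        # mutants per pool (from the text)

# Mutants are named "<plasmid>.<isolate>", e.g. "1.1", "1.2", ...
library = [
    f"{plasmid}.{isolate}"
    for plasmid in range(1, N_PLASMIDS + 1)
    for isolate in range(1, MUTANTS_PER_TAG + 1)
]


def make_pools(n_plasmids, mutants_per_tag, pool_size):
    """Hypothetical scheme: one isolate from each plasmid of a 48-plasmid block."""
    pools = []
    for first in range(1, n_plasmids + 1, pool_size):
        block = range(first, min(first + pool_size, n_plasmids + 1))
        for isolate in range(1, mutants_per_tag + 1):
            pools.append([f"{plasmid}.{isolate}" for plasmid in block])
    return pools


def tags_unique(pool):
    """True if no two mutants in the pool derive from the same master plasmid."""
    tags = [name.split(".")[0] for name in pool]
    return len(tags) == len(set(tags))


pools = make_pools(N_PLASMIDS, MUTANTS_PER_TAG, POOL_SIZE)
print(len(library), len(pools))  # 960 mutants arranged in 20 pools
```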

The products of the input and output pool amplification are used in a second PCR
amplification using α-32P dCTP to generate two radiolabeled probes. The amplified probes
consist of a central variable region (the unique DNA tag) flanked by arms of invariable
sequences which permit amplification of any tag using a defined set of primers. The arms
are released by digestion with HindIII and the radiolabeled tags are used to probe replicate
membranes from the master plasmid collection. Because of the complex structure of the
mycobacterial cell wall and difficulties encountered in mycobacterial colony hybridization,
in this example the amplified tags are used as probes to a dot blot containing the master
plasmid collection. Hybridization to other forms of the master plasmid collection can, of
course, be used. Tags from mutants that hybridize to the probe from the input pool (Figure
7, membrane 1) but not to the probe from the output pool (Figure 7, membrane 2)
represent mutants which are unable to survive or compete in the fish model. Such mutants
are designated as potential virulence mutants.

The pools of mutants recovered from different organs are kept separate, in order 
to characterize virulence mutants with regard to the organs examined. In some cases, 
mutations necessary for survival at different points in the pathogenesis of this organism 
can be identified, since the mechanisms necessary for survival in liver, spleen and kidney, 
or in other organs, may differ. The pools of mutants recovered from different fish are also
kept separate. Mutants from two fish are used independently to produce an output pool 
probe and are independently hybridized to replica membranes to confirm reproducible 
identification of potential virulence mutants from a given experiment. 
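
The comparison of input and output pool probes, and the requirement that a candidate be identified reproducibly in independently processed fish, amount to simple set operations. The following sketch is illustrative only; the tag numbers and fish labels are hypothetical, and a real analysis would first have to score each dot on the membrane as positive or negative.

```python
def candidate_tags(input_tags, output_tags):
    """Tags detected with the input pool probe but not with the output pool probe."""
    return set(input_tags) - set(output_tags)


def reproducible_candidates(input_tags, outputs_by_fish):
    """Keep only the tags that are missing from the output pool of every fish."""
    per_fish = [candidate_tags(input_tags, out) for out in outputs_by_fish.values()]
    return set.intersection(*per_fish) if per_fish else set()


# Hypothetical dot-blot calls: tag numbers that gave a hybridization signal.
input_pool = {1, 2, 3, 4, 5, 6}
output_pools = {
    "fish_A": {1, 2, 4, 6},
    "fish_B": {1, 2, 3, 4, 6},
}

print(sorted(reproducible_candidates(input_pool, output_pools)))  # [5]
```

In this hypothetical case tag 3 is lost in only one fish and is therefore not retained, whereas tag 5 is absent from both output pools and would be designated a potential virulence mutant.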

B. Confirming that the mutants are avirulent by examining individual mutants in 
the goldfish model. 

M. marinum transposon mutants that reproducibly hybridize to the input pool probe
but not to the output pool probe are examined individually in the goldfish model. An
inoculum dose of 10^8 bacteria in 0.5 ml per fish is used to inoculate 3 fish per mutant.
Control groups of fish are simultaneously inoculated with M. marinum ATCC 927 (wild type)
at the same dose as the mutants, or with PBS as a negative control. The median survival
time (MST) of goldfish inoculated with the wild type at this dose is 10 days. If the MST
for a given mutant is greater than that of the wild type, this indicates that the mutant may
have the transposon inserted into a virulence gene. When a mutant-inoculated fish
survives for 35 days, it is sacrificed and examined for histopathology, and portions of the
liver, spleen and kidney are homogenized and plated for colony counts. These mutants are
then inoculated into fish to determine the LD50. Three fish per mutant per dose are injected
with 10^8, 5 x 10^7, or 10^7 CFU bacteria. The LD50 for each mutant is evaluated at 1 week
postinoculation and calculated by the method of Reed and Muench (1938. Am. J. Hyg. 27,
493-497). The LD50 at 1 week for the wild type strain is 4.5 x 10^8 CFU bacteria per fish.
The LD50, Competitive Index, and/or pathology for each mutant is compared to that of the
wild type strain.
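
For illustration, an LD50 calculation of the Reed and Muench type can be sketched as below. The sketch assumes the usual cumulative-total form of the method, in which an animal killed at a given dose is counted as dead at all higher doses and an animal surviving a given dose is counted as surviving at all lower doses, with interpolation on a log10 dose scale. The mortality counts in the example are hypothetical and are not data from the experiments described herein.

```python
import math


def reed_muench_ld50(doses, deaths, survivors):
    """Estimate the LD50 from per-dose death and survivor counts.

    doses must be in ascending order; deaths[i] and survivors[i] are the
    counts observed at doses[i].
    """
    n = len(doses)
    # Deaths accumulate upward in dose, survivors accumulate downward.
    cum_dead = [sum(deaths[: i + 1]) for i in range(n)]
    cum_alive = [sum(survivors[i:]) for i in range(n)]
    mortality = [100.0 * d / (d + a) for d, a in zip(cum_dead, cum_alive)]

    for i, pct in enumerate(mortality):
        if pct == 50.0:
            return float(doses[i])
    for i in range(n - 1):
        if mortality[i] < 50.0 < mortality[i + 1]:
            # Proportionate distance between the two bracketing doses.
            distance = (50.0 - mortality[i]) / (mortality[i + 1] - mortality[i])
            log_low = math.log10(doses[i])
            log_high = math.log10(doses[i + 1])
            return 10 ** (log_low + distance * (log_high - log_low))
    raise ValueError("50% mortality is not bracketed by the tested doses")


# Hypothetical outcome for one mutant: 3 fish per dose at the three doses above.
doses = [1e7, 5e7, 1e8]
deaths = [0, 1, 3]
survivors = [3, 2, 0]
ld50 = reed_muench_ld50(doses, deaths, survivors)
print(f"{ld50:.2e}")
```

In the hypothetical example the cumulative mortalities are 0%, 33.3% and 100%, so the estimate lies one quarter of the way, on the log scale, between 5 x 10^7 and 10^8 CFU.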

Competitive index: The competitive index may be used as a measure of the
attenuation of a mutant with respect to a wild type strain. Mutant and wild type strains are
mixed together in the inoculum. Animals are inoculated with the mixture and are sacrificed
2 weeks postinoculation. The liver of the animal is removed,
homogenized, and the colony counts in the tissue are determined for both the mutant and
wild type strains. The two strains are distinguished because the mutant is kanamycin
resistant while the wild type is kanamycin sensitive. Mathematically, the competitive
index is defined as the output ratio of mutant to wild type bacteria, divided by the input
ratio of mutant to wild type bacteria. A mutant which has full virulence with respect to
the wild type should not be outcompeted by the wild type, and its competitive index
should be 1.0.
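
The definition of the competitive index can be written out as a short calculation. The sketch below is illustrative only. It assumes that the wild type count is obtained by subtracting the kanamycin-resistant count from the total count on non-selective medium, which is one possible way of distinguishing the two strains; the colony counts shown are hypothetical.

```python
def wild_type_cfu(total_cfu, kan_resistant_cfu):
    """Wild type count, assuming total = mutant (kanamycin resistant) + wild type."""
    return total_cfu - kan_resistant_cfu


def competitive_index(mutant_in, wt_in, mutant_out, wt_out):
    """(output mutant / output wild type) divided by (input mutant / input wild type)."""
    return (mutant_out / wt_out) / (mutant_in / wt_in)


# Hypothetical colony counts for a 1:1 inoculum and a liver homogenate.
inoculum_total, inoculum_kan = 2.0e7, 1.0e7
liver_total, liver_kan = 5.05e5, 5.0e3

ci = competitive_index(
    inoculum_kan,
    wild_type_cfu(inoculum_total, inoculum_kan),
    liver_kan,
    wild_type_cfu(liver_total, liver_kan),
)
print(ci)
```

In this hypothetical case the index is about 0.01, i.e. the mutant is recovered one hundred-fold less efficiently than the wild type relative to the inoculum.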

Histopathology examinations: Portions of the liver, spleen and kidney, along with
peritoneum, heart, pancreas, or other organs evident to one of skill in the art, are fixed in
10% neutral buffered formalin for routine embedding in paraffin. Five µm thin sections
of the paraffin fixed tissues are prepared with a rotary microtome (American Optical,
Buffalo, NY). After dewaxing, the sections are stained for acid fast bacilli with modified
basic fuchsin stain and counterstained with methylene blue, or stained with hematoxylin
and eosin.

Colony counts in organ homogenates or the ability to induce granuloma formation:
These parameters can identify virulence defects which are more subtle than one which
causes the MST to change. Mutants identified in the screening protocol as failing to
survive in vivo, but which fail to cause a significant change from wild type in MST when
inoculated individually in fish, are further examined. For these experiments, an inoculum
dose of 10^7 CFU organisms is used, and animals are sacrificed at 4 and 8 weeks
postinoculation. The liver, spleen, kidney, and/or other organs which are evident to one
of skill in the art are harvested; one portion is homogenized for analysis of colony counts
and another portion is used for histopathology.

Example 5 - Sequencing and characterizing regions flanking the transposons in the
virulence mutants

Individual mutants confirmed in the goldfish model to be virulence mutants are 
examined by sequencing the nucleic acid flanking the site of insertion of the transposon. 
The sequence analysis can, of course, be performed before, simultaneously with, or after, 
a virulence defect has been confirmed. 

A. Direct sequencing of flanking regions 

In a most preferred embodiment, chromosomal DNA is isolated from each mutant
and cut with a restriction enzyme that cuts once within the transposon (in this example,
BamHI). Linkers bearing a predefined PCR primer site, designed and generated
using routine, art-recognized methods, are ligated to the BamHI-cut ends; and PCR
fragments are amplified, using as primers a first outward primer sequence specific for a
portion of the transposon, and a second inward primer specific for the PCR primer site in
the appended linker, to generate an "amplified PCR fragment". In this example, a
transposon-specific primer sequence is chosen based on the sequence of the inserted
transposon, IS1096. By "specific for," as used herein, is meant that a primer (e.g., the first
outward primer) is sufficiently complementary to a target (e.g., the transposon) to bind to
it (hybridize; serve as a PCR primer) under selected high stringent conditions, but not to
bind to other, unintended, nucleic acids. Southern analysis, in which the membrane to
which the DNA has been transferred is probed with an α-32P labeled aph (kanamycin
resistance) gene, can be used to identify the size of the "amplified PCR fragment" from
each mutant. For example, mutants 41.2, 80.1 and 86.1 shown in Example 9 have unique
amplified PCR fragments, of 550, 200 and 600 bp, respectively. The amplified PCR
fragments are sequenced directly, using as primers one or both of the primers used to
generate them, or are cloned into a vector such as pGEM and sequenced using primers
corresponding to vector sequences. Methods for probing gels and sequencing DNA are
routine and conventional in the art.



In another embodiment, the chromosomal DNA is cut with an enzyme which does
not cut within the transposon. A variety of enzymes can be tested until one which
generates a DNA fragment of an appropriate size is identified. Here, KpnI is used. The
DNA is then ligated to create circular species and amplified by PCR using outward-facing
primers complementary to the two ends of the transposon. In this way, the sequences
which flank the insertion are amplified. These fragments are directly sequenced, using the
same primers used to amplify the sequence.
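
The inverse PCR strategy just described can be pictured with a toy, string-level sketch. It is a simplification offered for illustration only: the sequences are short hypothetical stand-ins (not IS1096 or M. marinum sequence), only the top strand is modelled, and restriction-site overhangs are ignored. The sketch shows that outward-facing primers at the transposon ends amplify the right flank, the ligation junction and the left flank, in that order.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inverse_pcr(fragment, fwd_primer, rev_primer):
    """Product amplified from a self-ligated (circular) restriction fragment.

    The circle is modelled by doubling the linear top strand, so that a read
    starting at the forward primer can run through the ligation junction.
    """
    circle = fragment + fragment
    start = circle.index(fwd_primer)
    rev_site = revcomp(rev_primer)  # top-strand sequence of the reverse primer site
    end = circle.index(rev_site, start + len(fwd_primer)) + len(rev_site)
    return circle[start:end]


# Hypothetical toy sequences.
LEFT_FLANK = "CATTAGGCAT"
TRANSPOSON = "GGCTCTTCAGTTACGGAAGCCTAAGTC"
RIGHT_FLANK = "TTGACCGATA"

fwd = TRANSPOSON[-8:]          # reads outward from the right transposon end
rev = revcomp(TRANSPOSON[:8])  # reads outward from the left transposon end

product = inverse_pcr(LEFT_FLANK + TRANSPOSON + RIGHT_FLANK, fwd, rev)
print(product)
```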

B. Cloning and then sequencing flanking regions 

In another embodiment, the gene sequences interrupted by a transposon are cloned
first and then sequenced. Procedures for the analysis of DNA, including isolating DNA,
cloning it, manipulating it, and sequencing it, are routine and well-known in the art. In a
preferred embodiment, genomic DNA is extracted from each virulence mutant, and is
digested with one or more restriction enzymes (e.g., in this example, KpnI or BamHI) that
provide genomic fragments of an appropriate size for cloning. The digested DNA is
cloned into an appropriate plasmid, e.g., Bluescript II KS (Promega), or a low-copy
plasmid such as pACYC184, in E. coli DH5α, by using an appropriate positive selection
marker (e.g., kanamycin resistance). KpnI does not cut within the transposon, so digestion
with KpnI, followed by selection with kanamycin, results in cloning of the transposon
along with flanking DNA. BamHI cuts once within the transposon, so digestion with
BamHI, followed by selection with kanamycin, results in cloning of part of the transposon
along with flanking DNA on one side of the transposon. Once cloned, the gene sequence
interrupted (disrupted) by the transposon is determined by using outward primers based
on the sequence of the transposon insertion sequence, in this example, IS1096 (see, e.g.,
McAdam et al. (1995). Infect. Immun. 63, 1004-1012).

C. Comparison of flanking sequences to known databases 

DNA sequences flanking each transposon (localized on one or on both sides of the
site of transposon insertion) are compared to known sequences with the use of the BLAST
programs provided by the National Center for Biotechnology Information (NCBI).

In order to identify M. tuberculosis homologues of M. marinum virulence genes,
the flanking sequences are also compared to the Mycobacterium database, using the
advanced BLAST search program, as above.

A discussion of functional homologues and related virulence genes from M.
tuberculosis which have been identified for 3 M. marinum mutants is presented in
Example 9.



Example 6 - Isolating and characterizing wild type M. marinum genes which correspond
to the genes disrupted by transposons in avirulent M. marinum mutants

Probes based on flanking M. marinum DNA sequences, characterized, e.g., as in
Example 5, are generated and used to screen an M. marinum cosmid library (the
construction of such a cosmid library is described below). For example, part or all of the
"amplified PCR fragment" which is described in Example 5 is labeled and used as a
hybridization probe. Conditions for specifically hybridizing a probe to a target nucleic
acid (e.g., cosmid DNA) can be determined routinely by known methods in the art (see,
e.g., Nucleic Acid Hybridization, a Practical Approach, B.D. Hames and S.J. Higgins,
eds., IRL Press, Washington, 1985). It is preferred that hybridization probing is done
under selected high stringent conditions to ensure that the gene, and not a relative, is
obtained. Of course, conditions of any stringency can be employed. By "high stringent"
is meant that the gene hybridizes to the probe (e.g., when the gene is immobilized on a
filter) and the probe (which in this case is preferably about >200 nucleotides in length) is,
e.g., in solution, and the immobilized gene/hybridized probe is washed in 0.1X SSC at
65°C for 10 minutes. SSC is 0.15 M NaCl/0.015 M Na citrate. In general, "high stringent
hybridization conditions" are used which allow hybridization only if there are about 10%
or fewer base pair mismatches. As used herein, "high stringent hybridization conditions"
means any conditions in which hybridization will occur when there is at least 95%,
preferably about 97 to 100%, nucleotide complementarity (identity) between the nucleic
acids. The corresponding cosmid is identified; and individual virulence genes are
subcloned from the cosmid clone, using routine, conventional procedures in the art. The
complete gene sequence is determined by routine, conventional methods.

Construction of an M. marinum cosmid library: An M. marinum genomic library
in an E. coli - mycobacteria shuttle cosmid (pYUB18) is constructed, using, e.g., methods
disclosed in Jacobs, W.R. et al. (1991). "Genetic Systems for Mycobacteria," in Methods
Enzymol. 204, 537-555. The pYUB18 vector has a unique BamHI site that can serve as
the site of insertion of partially Sau3A-digested chromosomal DNA. Following in vitro
packaging, the constructed libraries are transduced into cosmid in vivo packaging strains
to permit amplification and efficient repackaging of recombinant cosmids into
bacteriophage λ heads, thus allowing for storage of the libraries as phage lysates.



Example 7 - Isolating and characterizing M. tuberculosis genes which correspond to M.
marinum virulence genes 

In order to identify an M. tuberculosis gene which corresponds to a particular M.
marinum gene, an "amplified PCR fragment" from the M. marinum gene, such as that
described in Example 5 or a fragment thereof, can be used to probe a cosmid library of M.
tuberculosis. Most preferably, a probe based on the corresponding M. tuberculosis
sequence, itself, is used. An M. tuberculosis cosmid library is constructed by routine
methods. Hybridization is performed as described, e.g., in Example 6. Positive cosmid
clones are identified and the hybridizing sequences subcloned and sequenced, using
routine, conventional methods in the art.

Well-defined mutations can be introduced into a cloned M. tuberculosis gene,
using the methods described herein for generating site-specific mutations in M. marinum
genes. The mutations can then be introduced into the M. tuberculosis genome by
homologous recombination. In a most preferred embodiment (as disclosed, e.g., in
Balasubramanian, V. et al. (1996). J. Bacteriol. 178, 273-279, and Reyrat, J. et al. (1995).
PNAS 92, 8768-8772), the recombination is performed with long linear recombination
substrates containing the mutated gene (virulence gene::aph) on a DNA fragment (>40 kb).
This fragment is electroporated into the H37Rv strain of M. tuberculosis, selecting for
kanamycin resistance. Chromosomal DNA from the parent H37Rv strain and the
kanamycin-resistant transformants are digested with KpnI and probed with a KpnI
fragment containing the virulence gene::aph fragment. The strains containing the
disrupted allele show a signal from a fragment which is 1.3 kb greater (aph gene) than the
hybridizing fragment from the wild type gene clone (control). These mutant strains can
be tested, e.g., in the guinea pig infection model (see, e.g., Collins, D.M. et al. (1995).
PNAS 92, 8036-8040).

Alternatively, allelic exchange can be performed using ts-sacB vectors (see, e.g.,
Pelicic et al. (1997). PNAS 94, 10955-10960). The virulence gene::aph construct is
inserted into pJM10, a ts-sacB E. coli - mycobacteria vector containing the kanamycin
resistance gene for selection. The plasmid is introduced into the H37Rv strain of M.
tuberculosis by electroporation with selection initially at 32°C on 7H10-kanamycin.
Transformants are selected, grown in liquid culture, and then plated at 39°C on 7H10-
kanamycin + 2% sucrose plates. Transformants obtained on the counterselective plates
represent allelic exchange mutants.

Example 8 - Complementation assays 

A candidate virulence gene is reintroduced into a transposon mutant on a low copy
number E. coli - mycobacteria shuttle vector (pYUB213Δkm) (Ramakrishnan, L. et al.
(1997). J. Bacteriol. 179, 5862-5868) to determine whether the cloned gene complements
the virulence defect in the goldfish model. This plasmid is a derivative of pMV262
(Stover, C.K. et al. (1991). Nature 351, 456-460) with a bleomycin resistance gene for
selection. Bacteria are recovered from those fish in which the virulence defect has been
complemented, and analyzed for bleomycin and kanamycin resistance to confirm that the
complementing plasmid is present.

Some cloned virulence gene candidates may fail to complement the virulence 
defect in the fish model because of, e.g., instability of the cosmid clone, polar effects in 
the original mutation, requirement for a cluster of genes surrounding the interrupted gene, 
or toxic effects associated with overexpression of genes from multicopy plasmids. In 
order to overcome these problems, several alternative approaches can be used. 

One approach is to utilize an integrating E. coli - mycobacteria shuttle vector,
pMV361 (Stover, C.K. et al. (1991). Nature 351, 456-460). The vector integrates in a site-
specific manner into the chromosomal attB site. This site is in a well-conserved part of
the mycobacterial genome and has been identified in BCG, M. smegmatis, M. bovis, M.
chelonei, M. leprae, M. phlei, and M. tuberculosis. Prior to the use of this vector in M.
marinum, the presence of the attB site in M. marinum is confirmed by Southern blot
analysis of M. marinum chromosomal DNA digested with BamHI, using a radiolabeled
1.7-kb SalI attB fragment from M. smegmatis. In order to use this vector in mutants which
contain the kanamycin resistance gene, the vector is modified to delete the kanamycin
gene and to insert the bleomycin gene, as was done, e.g., in the construction of
pYUB213Δkm (Ramakrishnan, L. et al. (1997). J. Bacteriol. 179, 5862-5868). Using an
integrating vector eliminates the possible instability seen with extrachromosomal plasmid
maintenance in vivo (the integrated vector is stably maintained even without antibiotic
selection), and the toxic effects associated with multicopy plasmids are reduced or
eliminated since integration results in a single copy of the gene in the chromosome. To
address the possibility that the original transposon insertion phenotype was due to a polar
effect on a downstream gene, or that a cluster of genes is required for complementation,
larger fragments of the original cosmid clone can be inserted into the integrating plasmid.

Another approach is to construct by allelic exchange specific chromosomal
mutations in the identified virulence genes. Methods for using long linear recombination
substrates for allelic exchange are provided, e.g., in Balasubramanian, V. et al. (1996). J.
Bacteriol. 178, 273-279. Other methods for homologous recombination are found, e.g.,
in Aldovini, A.R. et al. (1993). J. Bacteriol. 175, 7282-7289; Norman, E. et al. (1995). Mol.
Microbiol. 16, 755-760; Baulard, A. et al. (1996). J. Bacteriol. 178, 3091-3098; Marklund,
B.I. et al. (1995). J. Bacteriol. 177, 6100-6105; and Ramakrishnan, L. et al. (1997). J.
Bacteriol. 179, 5862-5868. These specific mutations allow the creation of non-polar
mutations in the virulence genes.

Example 9 - Identification and characterization of thirteen M. tuberculosis virulence genes 

DNA regions flanking transposon insertion points for 13 mutants were amplified
by inverse PCR and sequenced. Predicted amino acid sequences from all six reading
frames of the DNA sequences obtained were subjected to a similarity search of the nr
database, using the NCBI BLAST program. The nr database includes, e.g., all
non-redundant GenBank CDS translations, PDB, SwissProt, PIR and PRF sequences. An
advanced BLAST search determined whether a homologous protein sequence was present
in the Mycobacterium tuberculosis genome. The translated flanking sequences of mutants
41.2, 80.1, 86.1, 62.2, 67.1, 80.8, 39.2, 114.7, 32.2, 42.2, 60.2, 68.6 and 95.3 exhibited
sequence identities with functionally homologous proteins from M. tuberculosis of 93%,
42%, 37-51%, 77%, 38%, 78%, 43%, 82%, 64%, 62%, 58-77%, 38%, and 36-47%,
respectively.
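
The "respectively" pairing in the preceding paragraph can be hard to follow across thirteen entries. The following is a minimal Python sketch that pairs each mutant with its reported identity and ranks the mutants by the lower bound of that identity. The helper names (`pair_respectively`, `lower_bound`) are illustrative assumptions and are not part of the disclosure; the values are copied from the paragraph above.

```python
# Mutant identifiers and percent identities, in the order listed in the text.
mutants = ["41.2", "80.1", "86.1", "62.2", "67.1", "80.8", "39.2",
           "114.7", "32.2", "42.2", "60.2", "68.6", "95.3"]
identities = ["93%", "42%", "37-51%", "77%", "38%", "78%", "43%",
              "82%", "64%", "62%", "58-77%", "38%", "36-47%"]


def pair_respectively(keys, values):
    """Pair two lists item by item; refuse lists of unequal length."""
    if len(keys) != len(values):
        raise ValueError("lists must have the same length")
    return dict(zip(keys, values))


def lower_bound(percent_range):
    """Return the lower bound of a value such as '37-51%' or '93%' as an int."""
    return int(percent_range.rstrip("%").split("-")[0])


identity_by_mutant = pair_respectively(mutants, identities)

# Rank mutants from highest to lowest reported identity (lower bound of a range).
ranked = sorted(identity_by_mutant,
                key=lambda m: lower_bound(identity_by_mutant[m]),
                reverse=True)

for mutant in ranked:
    print(mutant, identity_by_mutant[mutant])
```
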

Gene 41.2 

The sequence of the flanking region of M. marinum mutant 41.2 is as follows:

5'-CGGGCCGATCTATGACGAGNACGACGGGACAGATGGGTCCCCGGATGGTCTA
CACCGAGACCAAACTGAACTCGTCGTTCTCCTTCGGCGGGCCCAAGTGTCT
GGTGAAGGTGATCCAAAAACTGTCCGGGTTGAGCATCAACCGGTTCATCGC
CATCGACTTCGTCGG-3' (SEQ ID NO: 4)

This can be translated in the third reading frame to the following protein sequence:

1 GRSMTXTTGQ MGPRMVYTET KLNSSFSFGG PKCLVKVIQK LSGLSINRFI
51 AIDFV (SEQ ID NO: 5)
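
As an illustration of what "translated in the third reading frame" means, the following minimal Python sketch translates a flanking sequence starting at its third base, using the standard genetic code. It is a sketch under stated assumptions, not the tool used in this work: the function names are hypothetical, the input is assumed to contain only A, C, G, T and N, and a codon containing N is reported as X unless every possible base gives the same residue.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_codon(codon):
    """Translate one codon; 'N' is resolved only if all four bases agree."""
    # Assumption: the codon contains only A, C, G, T or N.
    options = [BASES if base == "N" else base for base in codon]
    residues = {CODON_TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna, frame):
    """Translate the forward strand in frame 1, 2 or 3; drop a partial last codon."""
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = dna.upper()
    return "".join(translate_codon(seq[i:i + 3])
                   for i in range(frame - 1, len(seq) - 2, 3))


# Flanking sequence of mutant 41.2 (SEQ ID NO: 4), copied from the text.
SEQ_ID_4 = (
    "CGGGCCGATCTATGACGAGNACGACGGGACAGATGGGTCCCCGGATGGTCTA"
    "CACCGAGACCAAACTGAACTCGTCGTTCTCCTTCGGCGGGCCCAAGTGTCT"
    "GGTGAAGGTGATCCAAAAACTGTCCGGGTTGAGCATCAACCGGTTCATCGC"
    "CATCGACTTCGTCGG"
)

protein_41_2 = translate(SEQ_ID_4, 3)
print(protein_41_2)
```

The ambiguous codon AGN (serine or arginine) is what produces the X at position 6 of SEQ ID NO: 5.
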

The mutant (41.2), when tested individually in the goldfish model, exhibits 
attenuated virulence as compared to the wild type organism (See Figure 8). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb|CAA17628| (AL022004); Rv0822c). Using the
general genomic database, the gene has been shown to be most closely related to gene
emb|CAA20411.1| (AL031317), a transcriptional regulator of Streptomyces coelicolor
which belongs to the AraC family of transcriptional regulators. This suggests that the gene
identified as interrupted in mutant 41.2 is a putative transcriptional regulator belonging to
the AraC family.

The proteins belonging to this family have at least three main regulatory functions
in common: carbon metabolism, stress response, and pathogenesis (see, e.g., Gallegos,
M-T. et al (1997). Microbiology and Molecular Biology Reviews 61, 393-410). Certain
of these regulatory proteins are involved in the production of virulence factors in infections
of plants or mammals. These regulatory factors have been found in microbes that colonize
the gastrointestinal, respiratory, or genitourinary tracts. These proteins are involved
in stimulation of the synthesis of proteins that play a role in adhesion to epithelial tissues,
components of the cell capsule, and invasins. Some members of the family control the
production of other virulence factors. Some regulators are involved in the response to
stressors, including oxidative stress and transition from exponential growth to the
stationary phase. Without wishing to be bound by any mechanism, these observations
suggest that the role of this gene in M. tuberculosis pathogenesis may be in invasion of the
macrophage, survival in the macrophage (oxidative stress) or in transition to the latent
state of tuberculosis (transition from exponential to stationary phase).

Gene 80.1 

The sequence of the flanking region of M. marinum mutant 80.1 is as follows:

5'-ACCTCCTGAATGTGTGACATGGCCCTAGAACCCTGCNTTAGACTATTTACATA
CATGGCTTCACCCGGCCGCCTGTGCCACTCATAAGACTACTGGAATGGACC
AACAATCGCACAGTCATCTGAAGCAGGAGTCTGTTAATCACAGGCCCTGAA
GGAACAGTGACTGTGCAGAGAAAGACGGCAATGCATCCTGTTAACTAAGT
GGCTGGAGGAGTGCCAGGTCATTCCAAAGAACATCCCTGAAATCTGGAGG
AGAAGGTATAGTGAGCACCCCAAAATTTCAACTGGAGACATCANACCAGA
GTCTCTACTGAGCTGCCAAGCTTGCGGCCGCACTCGAGTAACTAGTTAACC
CCTTGGGGCCTCTAAACGGGTCTTGA-3' (SEQ ID NO: 6)

This can be translated in the second reading frame to the following protein sequence:

1 PPECVTWP*N PALDYLHTWL HPAACATHKT TGMDQQSHSH LKQESVNHRP
51 *RNSDCAEKD GNASC*LSGW RSARSFQRTS LKSGGEGIVS TPKFQLETSX
101 QSLY*AAKLA AALE*LVNPL GPLNGS* (SEQ ID NO: 7)

The mutant (80.1), when tested individually in the goldfish model, exhibits 
attenuated virulence as compared to the wild type organism (See Figures 9 and 12). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Functional homologues of this gene have
been identified in M. leprae (sp|P54580|YV23_MYCLE; B2168_C2_209) and M.
tuberculosis (sp|O11162|YV23_MYCTU; CY20G9.23). Based on the sequence analysis,
the gene identified as interrupted in mutant 80.1 is a hypothetical integral membrane
protein, most closely related to a glutamate receptor channel, dbj|BAA02254.1| (D12822),
from Mus musculus.

Gene 86.1 

The sequence of the flanking region of M. marinum mutant 86.1 is as follows:

5'-TCATCGCTAACCGGTTGAGCTACCGCCCGCACAGCGTGCCCATCATCTC
CAACCTGACCGGCTCACTTGCCACAGTCGAGCAACTCACATCGCCCCGCTA
TTGGGCACAGCATGTACGGGAGCCAGTGCGGTTTCATGACGGCGTTACCGG
CTTGTTGGCAGGCGGAGAACA-3' (SEQ ID NO: 8)

This can be translated in the third reading frame to the following protein sequence:

1 IANRLSYRPH SVPIISNLTG SLATVEQLTS PRYWAQHVRE PVRFHDGVTG
51 LLAGGE (SEQ ID NO: 9)



The mutant (86.1), when tested individually in the goldfish model, exhibits
attenuation in virulence as compared to the wild type organism (See Figures 10 and 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. A family of functional homologues of this
gene has been identified in M. tuberculosis (emb|CAB06094|Z83857|ppsE;
emb|CAB06605|Z84725|pks6; emb|CAB09100|Z95617|pks9;
emb|CAB09098|Z95617|pks8; emb|CAB06103|Z83858|pks1; pir|S73075|pks002c
protein). Based on the sequence analysis, the gene identified as interrupted in mutant
86.1 is a polyketide synthase gene, most closely related to polyketide synthase genes
AF263912 (Streptomyces noursei) and AF015823 (Streptomyces venezuelae).

Polyketides are lipid-like molecules that have potent biological activities.
Examples of polyketides include antibiotics (erythromycin), immunosuppressants
(rapamycin, FK506), antifungal agents (amphotericin B), antihelminthic agents
(avermectin), and cytostatins (bafilomycin). A polyketide toxin has been recently
described in Mycobacterium ulcerans (George, K.M. et al (1999). Science 282, 854-856),
but no homologue was identified by sequence analysis in M. tuberculosis. Although it was
recognized during analysis of the M. tuberculosis genome project that the genome contains
a large number of polyketide synthesis genes, no polyketides from M. tuberculosis have
been identified. The finding that a mutation in this gene attenuates the virulence of the M.
marinum strain suggests that, although a polyketide toxin has not been
identified, a product of this synthesis pathway is responsible for virulence. Without
wishing to be bound by any mechanism, these observations suggest that a product of the
polyketide synthesis pathway may be responsible for the tissue destruction and
immunological modulation characteristic of diseases such as leprosy and tuberculosis.

Gene 62.2

The sequence of the flanking region of M. marinum mutant 62.2 is as follows:

GATCCGGTGCCGCCTTGACCGGCCGCGCCACCAGTACCGCCGACGCCGCCCTG
GCCGCCGGCTTGTGCGGCTTGCGATGGGTCGGTGCTGTCGGTGCCGGTGCC
TCCGGTGCCGCCTTGGCCTCCGGTTCCGCCGGTGCCGCCCTGGCCGCCGGC
GCCTTGGATGCCGCCGGTGCCGGTTCCGGCTGCACCGCCCGTTCCGCCGGT
TCCGCCTGCGCCGCCGGTGCCT (SEQ ID NO: 10)

This can be translated in the -2 reading frame to the following protein sequence:

227 ggcaccggcggcgcaggcggaaccggcggaacgggcggtgcagcc
    GTGGAGGTGGTGGAA
182 ggaaccggcaccggcggcatccaaggcgccggcggccagggcggc
    GTGTGGIQGAGGQGG
137 accggcggaaccggaggccaaggcggcaccggaggcaccggcacc
    TGGTGGQGGTGGTGT
92 gacagcaccgacccatcgcaagccgcacaagccggcggccagggc
    DSTDPSQAAQAGGQG
47 ggcgtcggcggtactggtggcgcggccggtcaaggcggcaccgga
    GVGGTGGAAGQGGTG (SEQ ID NO: 12)
2 tc 1 (SEQ ID NO: 11)
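
The numbered layout above reads the complementary strand, so the coordinates count down along the forward strand. The following minimal Python sketch reproduces that layout for a minus reading frame. It is only an illustration under stated assumptions: the function names are hypothetical, the row width of 15 codons is inferred from the layout shown, and any codon containing N is simply reported as X.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def numbered_minus_frame_rows(dna, frame, codons_per_row=15):
    """Return (forward coordinate, DNA row, protein row) tuples for frame -1, -2 or -3."""
    if frame not in (-1, -2, -3):
        raise ValueError("frame must be -1, -2 or -3")
    rc = reverse_complement(dna)
    start = -frame - 1                      # offset into the reverse complement
    end = start + (len(rc) - start) // 3 * 3  # stop before a partial last codon
    width = 3 * codons_per_row
    rows = []
    for i in range(start, end, width):
        chunk = rc[i:min(i + width, end)]
        protein = "".join(CODON_TABLE.get(chunk[j:j + 3], "X")
                          for j in range(0, len(chunk), 3))
        # Index i on the reverse complement is base len(rc) - i (1-based) on the forward strand.
        rows.append((len(rc) - i, chunk.lower(), protein))
    return rows


# Flanking sequence of mutant 62.2 (SEQ ID NO: 10), copied from the text.
SEQ_ID_10 = (
    "GATCCGGTGCCGCCTTGACCGGCCGCGCCACCAGTACCGCCGACGCCGCCCTG"
    "GCCGCCGGCTTGTGCGGCTTGCGATGGGTCGGTGCTGTCGGTGCCGGTGCC"
    "TCCGGTGCCGCCTTGGCCTCCGGTTCCGCCGGTGCCGCCCTGGCCGCCGGC"
    "GCCTTGGATGCCGCCGGTGCCGGTTCCGGCTGCACCGCCCGTTCCGCCGGT"
    "TCCGCCTGCGCCGCCGGTGCCT"
)

rows_62_2 = numbered_minus_frame_rows(SEQ_ID_10, -2)
for coordinate, dna_row, protein_row in rows_62_2:
    print(coordinate, dna_row)
    print("   ", protein_row)
```
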

The mutant (62.2), when tested individually in the goldfish model, exhibits 
attenuated virulence (reduced Competitive Index) as compared to the wild type organism 
(See Figure 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using either the mycobacterium or the
general genomic database, a functional homologue of this gene has been identified in M.
tuberculosis (emb|CAA17748.1| (AL022022); Rv3511).

This is a hypothetical glycine-rich protein (Rv3511) belonging to the large M.
tuberculosis PE-PGRS protein family, which comprises roughly 5% of the coding DNA
of M. tuberculosis. The genes of this family are scattered throughout the genome of M.
tuberculosis and other closely related mycobacteria. This family is characterized by a
relatively conserved amino acid NH2-terminus. The function of these proteins is unknown,
but some hypotheses are that they represent a source of antigenic diversity or that their
glycine repeats inhibit host major histocompatibility complex class I processing, akin to
the glycine repeats of the Epstein-Barr virus EBNA-1 protein. The finding that a mutation
in this gene attenuates the virulence of the M. marinum strain suggests that the
protein product of this gene is responsible for the immunological modulation characteristic
of diseases such as leprosy and tuberculosis.

Gene 67.1 

The sequence of the flanking region of M. marinum mutant 67.1 is as follows: 

GGTCGAAGACTATCGGTATGCTCCATAGCGTTCCGTCGGGAAGCTGCATGT 
TGTCAAGGGTTTCGTCGACCTCTCGGCGACCCATGAATCCCGATAGTGGCG 
TGAAGAAACCGTACGAGATGCTGATCACCTCGTGGGCGGTCGCCTTCGATA 
TCGGGATGCGCACCAATCCCTCAATCCGGCCGGCCACGTTTTCCCTTTCCAC 
CCTGTCGACGAGTGGGTGTCCGTTATGGCCTAAATAATCCATCTTGCTGCCT 
CTTTCTGAAATCGAATTTATTACTATCG (SEQ ID NO: 13)

This can be translated in the six reading frames to the following protein sequences: 
DNA: GGTCGAAGACTATCGGTATGCTCCATAGCGTTCCGTCGGGAAGCTGCATGT
+3: SKTIGMLHSVPSGSCML
+2: VEDYRYAP*RSVGKLHV
+1: GRRLSVCSIAFRREAAC

DNA: TGTCAAGGGTTTCGTCGACCTCTCGGCGACCCATGAATCCCGATAGTGGCG
+3: SRVSSTSRRPMNPDSGV
+2: VKGFVDLSATHESR*WR
+1: CQGFRRPLGDP*IPIVA

DNA: TGAAGAAACCGTACGAGATGCTGATCACCTCGTGGGCGGTCGCCTTCGATA
+3: KKPYEMLITSWAVAFDI
+2: EETVRDADHLVGGRLRY
+1: *RNRTRC*SPRGRSPSI

DNA: TCGGGATGCGCACCAATCCCTCAATCCGGCCGGCCACGTTTTCCCTTTCCA
+3: GMRTNPSIRPATFSLST
+2: RDAHQSLNPAGHVFPFH
+1: SGCAPIPQSGRPRFPFP

DNA: CCCTGTCGACGAGTGGGTGTCCGTTATGGCCTAAATAATCCATCTTGCTGC
+3: LSTSGCPLWPK*SILLP
+2: PVDEWVSVMA*IIHLAA
+1: PCRRVGVRYGLNNPSCC

DNA: CTCTTTCTGAAATCGAATTTATTACTATCG (SEQ ID NO: 13)
+3: LSEIEFITI (SEQ ID NO: 14)
+2: SF*NRIYYY (SEQ ID NO: 15)
+1: LFLKSNLLLS (SEQ ID NO: 16)

DNA: CGATAGTAATAAATTCGATTTCAGAAAGAGGCAGCAAGATGGATTATTTAG
-1: R***IRFQKEAARWII*
-2: DSNKFDFRKRQQDGLFR
-3: IVINSISERGSKMDYLG

DNA: GCCATAACGGACACCCACTCGTCGACAGGGTGGAAAGGGAAAACGTGGCCG
-1: AITDTHSSTGWKGKTWP
-2: P*RTPTRRQGGKGKRGR
-3: HNGHPLVDRVERENVAG

DNA: GCCGGATTGAGGGATTGGTGCGCATCCCGATATCGAAGGCGACCGCCCACG
-1: AGLRDWCASRYRRRPPT
-2: PD*GIGAHPDIEGDRPR
-3: RIEGLVRIPISKATAHE

DNA: AGGTGATCAGCATCTCGTACGGTTTCTTCACGCCACTATCGGGATTCATGG
-1: R*SASRTVSSRHYRDSW
-2: GDQHLVRFLHATIGIHG
-3: VISISYGFFTPLSGFMG

DNA: GTCGCCGAGAGGTCGACGAAACCCTTGACAACATGCAGCTTCCCGACGGAA
-1: VAERSTKPLTTCSFPTE
-2: SPRGRRNP*QHAASRRN
-3: RREVDETLDNMQLPDGT

DNA: CGCTATGGAGCATACCGATAGTCTTCGACC (SEQ ID NO: 17)
-1: RYGAYR*SST (SEQ ID NO: 18)
-2: AMEHTDSLR (SEQ ID NO: 19)
-3: LWSIPIVFD (SEQ ID NO: 20)
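
A six-frame translation of the kind laid out above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in this work: the function names are hypothetical, a trailing partial codon is dropped, and a codon containing N is reported as X. The example uses the final 30 bases of SEQ ID NO: 13, which begin on a codon boundary of the +1 frame.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def translate_from_start(seq):
    """Translate complete codons from the first base; unknown codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(dna):
    """Return a dict mapping '+1'..'+3' and '-1'..'-3' to protein strings."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_from_start(forward[offset:])
        frames[f"-{offset + 1}"] = translate_from_start(reverse[offset:])
    return frames


# Last 30 bases of the flanking sequence of mutant 67.1 (SEQ ID NO: 13).
TAIL_67_1 = "CTCTTTCTGAAATCGAATTTATTACTATCG"

frames_67_1 = six_frame_translation(TAIL_67_1)
for name in ("+1", "+2", "+3", "-1", "-2", "-3"):
    print(name, frames_67_1[name])
```
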

The mutant (67.1), when tested individually in the goldfish model, exhibits 
attenuated virulence as compared to the wild type organism (See Figures 12 and 13). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using the mycobacterium database, a
functional homologue of this gene has been identified in M. tuberculosis
(emb|CAB08565.1| (Z95324) purA). This homologue, in the +2 frame, with an identity
of 38% (similarity of 57%), is an adenylosuccinate synthetase (M. tuberculosis homologue
O08381). This protein product plays an important role in the de novo pathway of purine
nucleotide biosynthesis. Thus in the host animal, particularly in the macrophage, where
nutrients may be limiting, the product of this gene may be required for survival of
Mycobacterium marinum and M. tuberculosis.

Based on sequence analysis against the entire genomic database, the gene identified
as interrupted in mutant 67.1 is a sulfate adenylyltransferase with homology to diverse
organisms including Pyrococcus abyssi, Synechocystis sp., and Bacillus subtilis. The
homology is in the -3 reading frame of the translated gene product and shows 27-40%
identity (51-62% similarity). The homology noted to the sulfate adenylyltransferase enzymes
suggests that mutant 67.1 is attenuated in its ability to respond to sulfate starvation, as this
enzyme is required for growth in defined synthetic medium with sulfate as a sulfur source.
This suggests that in the animal host a sulfur source is limiting and thus interruption of this
gene attenuates growth of the organism in the animal host. Thus interruption of this gene
in a live attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the
ability of the vaccine strain to grow in the animal host.



Gene 80.8 

The sequence of the flanking region of M. marinum mutant 80.8 is as follows:

CCAATTAGCTGATTATTCCTCGGGCGTGCTCAACGCCAAGGACTACATATC
AGGTTACTTCCACTAAAATTCGCGGGCCCCGATCGGCGACATTACTCGACG
GTTTTCGGGGGAATCTCAGCGGTGATGGCATTCTTGAGGGCGACGTAGCGT
TTGGCGTCGGGATC (SEQ ID NO: 21)

This can be translated in the -1 reading frame to the following protein sequence:
DPDAKRYVALKNAITAEIPPKTVE*CRRSGPANFSGSNLICSPWR*ARP
NNQLI (SEQ ID NO: 22)

The mutant (80.8), when tested individually in the goldfish model, exhibits
attenuation in virulence (reduced Competitive Index) as compared to the wild type
organism (See Figure 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using either the mycobacterium or the
general genomic database, a functional homologue of this gene has been identified in M.
tuberculosis (emb|CAB02482.1|Z80343|lipE). This is a probable carboxylic-ester
hydrolase (M. tuberculosis homologue Rv3775), also referred to as an esterase or lipE. The
homology is in the -1 reading frame, with 83% similarity and 78% identity. This gene may
have a role in fatty acid synthesis in Mycobacterium species or may be involved in
establishment or dissemination in the animal host by destruction of the host cell fatty acids
present in the host cell membrane. The finding that a mutation in this gene
attenuates the virulence of the M. marinum strain suggests that the protein product of this gene
is responsible for the virulence attributes of Mycobacterium species and may contribute
to the establishment of diseases such as leprosy and tuberculosis.



Gene 39.2 

The sequence of the flanking region of M. marinum mutant 39.2 is as follows:

GATCCGCTGGACGGCACCAAAGAATTCATCAAGGGCAGCGATGAGTTCAC
CGTCAACATCGCCCTGGTCGAGAACCAGGAACCCATTCTCGGGGCAATCTA
CGGTCCAGCGAAGCAACTTCTGCACTACGCGGCCAAAGGGGCT (SEQ ID NO: 23)

This can be translated in the +1 reading frame to the following protein sequence:

7 ctggacggcaccaaagaattcatcaagggcagcgatgagttcacc
    LDGTKEFIKGSDEFT
52 gtcaacatcgccctggtcgagaaccaggaacccattctcggggca
    VNIALVENQEPILGA
97 atctacggtccagcgaagcaacttctgcactacgcggccaaaggg
    IYGPAKQLLHYAAKG
142 gct 144 (SEQ ID NO: 43)
    A (SEQ ID NO: 24)



The mutant (39.2), when tested individually in the goldfish model, exhibits 
attenuation in virulence as compared to the wild type organism (See Figure 14). 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis, as described above for mutant 41.2. Using the mycobacterium database, a
functional homologue of this gene has been identified in M. tuberculosis
(emb|CAB06277.1|Z8386| hypothetical protein Rv3137). This homologue, in the +1
frame, with an identity of 43% (similarity of 63%), is a probable inositol monophosphate
phosphatase, because it contains an inositol monophosphatase family signature sequence.
It is related to the cysQ proteins identified in the whole database search described below,
which also belong to the inositol monophosphatase family.

Based on sequence analysis against the entire genomic database, the gene identified
as interrupted in mutant 39.2 is predicted to be a structural protein of an ammonium
transport system (also known as a cysQ gene). This protein affects the pool of
3'-phosphoadenosine-5'-phosphosulfate in the pathway of sulfite synthesis. The identity
is in the +1 reading frame of the translated gene product and is 53-65% (63-82%
similarity). The homology noted suggests that mutant 39.2 is attenuated in its ability to
respond to sulfate starvation, as this enzyme is required for growth in defined synthetic
medium with sulfate as a sulfur source. This suggests that in the animal host a sulfur
source is limiting and thus interruption of this gene attenuates growth of the organism in
the animal host. Thus interruption of this gene in a live attenuated Mycobacterium vaccine
strain would be beneficial, as it will limit the ability of the vaccine strain to grow in the
animal host.

Gene 114.7 

The sequence of the flanking region of M. marinum mutant 114.7 is as follows:

AGCCGTATTTCGCCATTGAGAGTTGGGGTCTTGAGATCGGCACTGGAAGGG
GACAGCGTGCTATTGCCTCTTGGTCCGCCCTTGCCACCTGATGCTGTGGCGG
CTAAACGGGGTGAGTCGGGGCTGCTCTGCGGCTTGTCGGTTCCGCTCAGCT
GGGGTACGGCCGTTCCGCCGGATGACTACNACCATTGGGCACCGGAGCCTG
AAGAAGGCGCCGAGGCCGTGGTCGAAGAAAACGTGGATGCGGCAGCTGCC
GGTACCGACGAGTGGGACGAGTGGGCGGAATGGAGGGAGTGGGAGGCAG
CAAATGCCCGAACCTCATTTTCGAGATGCCCCGTACCAGCAGCCGTGATAC
CCGAACTCGCCGGCGGCCGGTTGAGA (SEQ ID NO: 25)

This can be translated in the +1 reading frame to the following protein sequence:

16 ttgagagttggggtcttgagatcggcactggaaggggacagcgtg
    LRVGVLRSALEGDSV
61 ctattgcctcttggtccgcccttgccacctgatgctgtggcggct
    LLPLGPPLPPDAVAA
106 aaacggggtgagtcggggctgctctgcggcttgtcggttccgctc
    KRGESGLLCGLSVPL
151 agctggggtacggccgttccgccggatgactacnaccattgggca
    SWGTAVPPDDYXHWA
196 ccggagcctgaagaaggcgccgaggccgtggtcgaagaaaacgtg
    PEPEEGAEAVVEENV
241 gatgcggcagctgccggtaccgacgagtgggacgagtgggcggaa
    DAAAAGTDEWDEWAE
286 tggagggagtgggaggcagcaaatgcccgaacctcattttcgaga
    WREWEAANARTSFSR
331 tgccccgtaccagcagccgtgatacccgaactcgccggcggccgg
    CPVPAAVIPELAGGR
376 ttgaga 381 (SEQ ID NO: 44)
    LR (SEQ ID NO: 26)

The mutant (114.7), when tested in pools in the goldfish model, appears to exhibit
attenuation in virulence as compared to the wild type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using either the mycobacterium or the general genomic database, a functional
homologue of this gene has been identified in M. tuberculosis (pir E70662); (Rv2348c).
The homology is in the +1 reading frame, with an identity of 82% (similarity 84%), to a
hypothetical protein of M. tuberculosis. This protein is of unknown function as it has no
known homology to any other sequence in the database. Extrapolating from the animal
model, it appears that this gene is a virulence gene in M. marinum and M. tuberculosis.

Mutant 32.2 

The sequence of the flanking region of M. marinum mutant 32.2 is as follows:

TCCANNCAGAGGNGCACGTAGANCGTAGGACGGAANGCGGNGNGATCGNC
AATACGGCTGGCNCTGCNAGAACTGNTCGAGGGCCTGCNGCTGGGGCC
(SEQ ID NO: 27)

This can be translated in the -2 reading frame to the following protein sequence:
APAAGPRXVLAXPAVLXIXPXSVLRSTCXSXW (SEQ ID NO: 28)

The mutant 32.2, when tested individually in the goldfish model, exhibits 
attenuated virulence (reduced Competitive Index, see Figure 12) as compared to the wild 
type organism. 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb CAB06230 (Z83864) (Rv3860)). This is a gene
encoding a hypothetical protein of unknown function with homology to other
Mycobacterium proteins also of unknown function, including [emb CAB08086 (Z94121)
(Rv3888c); emb CAA75199 (Y14967); emb CAA17968 (AL022120) (Rv3876); emb
CAB08981 (Z95558) (Rv0530) and emb CAA15582 (AL008967) (Rv2787)]. The finding
that a mutation in this gene attenuates the virulence of the M. marinum strain
suggests that the protein product of this gene contributes to the disease process in
tuberculosis and leprosy. The interruption of this gene in a live attenuated Mycobacterium
vaccine strain would be beneficial, as it will limit the ability of the vaccine strain to grow
in the animal host.

The homology with the M. tuberculosis homologue is 64% identity, 78%
similarity.

Mutant 42.2 

The sequence of the flanking region of M. marinum mutant 42.2 is as follows:

TTTGCAATCCACCTGTACGCGGAACTNTTNANNNCCGTTTTGCCTTGNCGA
ATAAGCTAGCT (SEQ ID NO: 29)

This can be translated in the -1 reading frame to the following protein sequence:
S*LIRQGKTXXXSSAYRWIA (SEQ ID NO: 30)

The mutant 42.2, when tested individually in the goldfish model, exhibits 
attenuated virulence (reduced Competitive Index, see Figure 12 and decreased virulence 
in LD50 experiment, Figure 15) as compared to the wild type organism. 

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (emb CAB03756 (Z81371) (mbtB)). This is a gene
involved in mycobactin biosynthesis. M. tuberculosis produces both cell-associated
mycobactins and secreted, water-soluble mycobactins. Both types are siderophores and
act to scavenge iron from the environment to support growth of the organism. The genes
involved in mycobactin synthesis are contained in an operon. The finding that
a mutation in this gene attenuates the virulence of the M. marinum strain suggests that iron is
required for Mycobacterium growth in the animal host. The interruption of this gene in
a live attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the
ability of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue is 62% identity, 99%
similarity.

Mutant 60.2 

The sequence of the flanking region of M. marinum mutant 60.2 is as follows:

CCANACCTATCTGTTTNCAGNTTNAGACNACGGNATCTCACGCGNTTGGGC
CCNGCCACCAAACGCCGCGTNGA (SEQ ID NO: 31)

This can be translated in six reading frames to the following protein sequences:

DNA: CCANACCTATCTGTTTNCAGNTTNAGACNACGGNATCTCACGCGNTTGGGC
+3: XPICXQXXTTXSHAXGP
+2: XTYLFXXXDXGISRXWA
+1: PXLSVXXXRXRXLTRLG

DNA: CCNGCCACCAAACGCCGCGTNGA (SEQ ID NO: 31)
+3: XHQTPRX (SEQ ID NO: 32)
+2: XPPNAAX (SEQ ID NO: 33)
+1: PATKRRV (SEQ ID NO: 34)

>60.2/T89 T87 removed

DNA: TCNACGCGGCGTTTGGTGGCNGGGCCCAANCGCGTGAGATNCCGTNGTCTN
-1: STRRLVAGPXRVRXRXL
-2: XRGVWWXGPXA*DXVVX
-3: XAAFGGXAQXREXPXSX

DNA: AANCTGNAAACAGATAGGTNTGG (SEQ ID NO: 35)
-1: XLXTDRX (SEQ ID NO: 36)
-2: XXKQIGX (SEQ ID NO: 37)
-3: XXNR*VW (SEQ ID NO: 38)

The mutant 60.2, when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index, see Figure 12) as compared to the wild
type organism.

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, functional homologues of this gene have
been identified in M. tuberculosis [emb CAA17485 (AL021957) (Rv2181); emb
CAB06507 (Z84498) (Rv1954c); emb CAA17586 (AL021999) (Rv0987); emb
CAB07087 (Z92771) (Rv3268); emb CAB08632 (Z95387) (Rv2610c)]. This is a gene
encoding a hypothetical integral membrane protein of unknown function. The finding
that a mutation in this gene attenuates the virulence of the M. marinum strain
suggests that it is required for Mycobacterium growth in the animal host. The interruption
of this gene in a live attenuated Mycobacterium vaccine strain would be beneficial, as it
will limit the ability of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue Rv2181 is 58% identity, 66%
similarity; overall homology with all the genes identified is 58-77% identity, 66-88%
similarity.



Gene 68.6 

The sequence of the flanking region of M. marinum mutant 68.6 is as follows:

AAATCATCATCTATCGTTACCCGGGGCAAGCCAAGCACCTCAGCAAAAATT
CTGCAGAGCATTTCCTCTTGCGGAGTTCGCGGCATACGGCCAATCGCCGCA
TGATGATCGGGCACAGGCAGCGCTTTACGATCCACCTTCTTATTCGGAGTT
AACGGCATGGTCTCAAGTCTTACGATGACAGACGGCACCATATATTCGGCC
AGTTTCAGGGAGGCGTAGCGCCGCAGTTCTGCTGTATCTATCA
(SEQ ID NO: 39)

This can be translated in the -3 reading frame to the following protein sequence:

1 IDTAELRRYA SLKLAEYMVP SVIVRLETMP LTPNKKVDRK ALPVPDHHAA
51 IGRMPRTPQE EMLCRIFAEV LGLPRVTIDD D (SEQ ID NO: 40)



The mutant (68.6), when tested individually in the goldfish model, exhibits
attenuated virulence (reduced Competitive Index) as compared to the wild type organism
(Figure 12).

The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the mycobacterium database, a functional homologue of this gene has
been identified in M. tuberculosis (pir E70751 emb CAA98937 Z74410); (nrp protein).
The homology is in the -3 reading frame, with an identity of 43% (similarity 62%), to a
probable nrp protein of M. tuberculosis. This protein belongs to a superfamily of acetate
CoA ligase proteins involved in peptide synthesis. A second protein of M. tuberculosis
also shows significant homology. This protein is the mbtE protein (pir C70588 emb
CAB08481 Z95208). The homology is again in the -3 reading frame, with an identity of
38% (similarity 56%). This is a gene involved in mycobactin biosynthesis. M.
tuberculosis produces both cell-associated mycobactins and secreted, water-soluble
mycobactins. Both types are siderophores and act to scavenge iron from the environment
to support growth of the organism. The genes involved in mycobactin synthesis are
contained in an operon.

Searching against the entire database, we have identified significant homologues
in Bacillus subtilis. The gene homologue is dhbF, a gene involved in
2,3-dihydroxybenzoate biosynthesis. The gene has been identified as essential for the
synthesis of a siderophore in B. subtilis.



Mutant 95.3

The sequence of the flanking region of M. marinum mutant 95.3 is as follows:

GATTAGCTTATTCCTCAAGGCACGAGCGATTAGCTTATTCCTCAAGGCACG
AGCGACTAGCTTATTCCTCAAGGCACGAGCTTCGCACTTGACGGTGTAGAG
CTCAATAGCTTATTCCTCAAGGCACGAGCTCGACTTCGCACTTGACGGTGT
AGAGCTCAAAG (SEQ ID NO: 41)

This can be translated in the +1 reading frame to the following protein sequence:

1 D*LIPQGTSD*LIPQGTSD*LIPQGTSFALDGVELNSLFLKARARLRT*R
51 CRAQ (SEQ ID NO: 42)



The gene interrupted in the attenuated mutant has been characterized by sequence
analysis. Using the Mycobacterium database, functional homologues of this gene have
been identified in M. tuberculosis [pir B70963 emb CAB0717 (Z92669) (Rv0236c); pir
B70748 emb CAA98982 (Z74697) smc protein]. This is a gene encoding a hypothetical
integral membrane protein of unknown function. The finding that a mutation
in this gene attenuates the virulence of the M. marinum strain suggests that it is required for
Mycobacterium growth in the animal host. The interruption of this gene in a live
attenuated Mycobacterium vaccine strain would be beneficial, as it will limit the ability
of the vaccine strain to grow in the animal host.

The homology with the M. tuberculosis homologue Rv0236c is 36% identity, 64%
similarity, and with the smc protein is 47% identity, 61% similarity.

From the foregoing description, one skilled in the art can easily ascertain the 
essential characteristics of this invention, and without departing from the spirit and scope 
thereof, can make changes and modifications of the invention to adapt it to various usage 
and conditions. 

Without further elaboration, it is believed that one skilled in the art can, using the 
preceding description, utilize the present invention to its fullest extent. The preceding 
preferred specific embodiments are, therefore, to be construed as merely illustrative, and 
not limitative of the remainder of the disclosure in any way whatsoever. 



The entire disclosures of all applications, patents and publications cited above and
in the figures are hereby incorporated by reference.



